Abstract. This paper introduces an abstract notion of fragments of monadic 
second-order logic. This concept is based on purely syntactic closure proper- 
ties. We show that over finite words, every logical fragment defines a lattice of 
languages with certain closure properties. Among these closure properties are 
residuals and inverse C-morphisms. Here, depending on certain closure proper- 
ties of the fragment, C is the family of arbitrary, non-erasing, length-preserving, 
length-multiplying, or length-reducing morphisms. In particular, definability in 
a certain fragment can often be characterized in terms of the syntactic morphism. 
This work extends a result of Straubing in which he investigated certain restric- 
tions of first-order logic formulae. In contrast to Straubing's model-theoretic 
approach, our notion of a logical fragment is purely syntactic and it does not 
rely on Ehrenfeucht-Fraïssé games. 

As motivating examples, we present (1) a fragment which captures the stutter-invariant 
part of piecewise-testable languages and (2) an acyclic fragment of $\Sigma_2$. 
As it turns out, the latter has the same expressive power as two-variable first-order 
logic $\mathrm{FO}^2$. 

1. Introduction 

A famous result of Büchi, Elgot, and Trakhtenbrot states that a language of finite words 
is regular if and only if it is definable in monadic second-order logic [U [71 [23] . Later Mc- 
Naughton and Papert considered first-order logic. They showed that a language is definable 
in first-order logic if and only if it is star-free [11]. It turned out that the class of first-order 
definable languages has a huge number of other characterizations; cf. [3]. Intuitively, a 
first-order definable language is easier to describe than a language which is not first-order 
definable. This leads to a natural notion of descriptive complexity inside the class of reg- 
ular languages: The simpler the formula to describe a language, the simpler the language. 
Pursuing this approach, there are several possible restrictions for formulae which come to 
mind. For example, one can restrict the quantifier depth, the alternation depth, the num- 
ber of variables, the set of atomic predicates, or the set of quantifiers, just to name a few. 
There are several problems connected with this approach towards descriptive complexity 
inside the class of regular languages. Firstly, simplicity of logical formulae is of course not a 
linear measure. And secondly, how do we test whether some language is definable in a given 
(infinite) class of formulae? 

There is no general solution to the first problem. Nevertheless, in some cases one can 
compare the expressive power of classes of formulae. Trivially, if a class of formulae is 
contained in another class of formulae, then we also have containment for the respective 
classes of languages. In some cases however, surprising inclusions and equivalences between 
syntactically incomparable fragments are known. For example, Thérien and Wilke have 
shown that a language is definable in first-order logic $\mathrm{FO}^2$ with only two different names 
for the variables if and only if it is definable with one quantifier alternation [21]. Note that 
this is a natural restriction since first-order logic with three variables already has the full 
expressive power of first-order logic. 

The solution to the second problem is usually obtained by an effective algebraic 
characterization. Schützenberger has shown that a language is star-free if and only if its syntactic 
monoid is aperiodic J^. Together with the result of McNaughton and Papert, this yields 
an effective characterization of definability in first-order logic, i.e., for a given regular lan- 
guage one can check whether this language is definable in first-order logic. This kind of 
correspondence between languages and finite monoids is formalized in Eilenberg's Variety 
Theorem [B], e.g., star-free languages correspond to finite aperiodic monoids. The main idea 
is the following. If a class of languages $\mathcal{V}$ has certain closure properties, then there exists 
a class of finite monoids $\mathbf{V}$ such that a language is in $\mathcal{V}$ if and only if its syntactic monoid 
is in $\mathbf{V}$. Now, if membership in $\mathbf{V}$ is decidable, then membership in $\mathcal{V}$ becomes decidable 
because the syntactic monoid can be computed effectively. The closure properties required 
by Eilenberg's Variety Theorem are Boolean operations, residuals, and inverse morphisms. 
A class of languages with these closure properties is called a variety. There are several 
variants and extensions of this approach. Pin has shown that there is an Eilenberg corre- 
spondence between positive varieties and ordered monoids [12]. A positive variety is a class 
of languages closed under positive Boolean operations, residuals, and inverse morphisms. A 
C-variety (for some class of morphisms C) is a class of languages closed under Boolean 
operations, residuals, and inverse C-morphisms. Straubing has given an Eilenberg correspondence 
between C-varieties and so-called stamps [18]. Here, decidability results usually rely on the 
syntactic morphism and not solely on the syntactic monoid. The work of Straubing was 
later amended by other results such as equational theories, positive C-varieties and a wreath 
product for C-varieties [H [TOl [H] . The most extensive generalization of Eilenberg's Variety 
Theorem is due to Gehrke, Grigorieff, and Pin [8]. They have shown that so-called lattices 
of languages admit an equational description. A lattice is a class of languages closed under 
positive Boolean operations. Depending on other closure properties such as residuals, the 
equational description of a lattice can be tightened. 
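
As a rough illustration of why such algebraic characterizations are effective, the following Python sketch computes the transition monoid of a complete deterministic automaton and tests it for aperiodicity. It assumes that the automaton is minimal, in which case its transition monoid is isomorphic to the syntactic monoid of the accepted language. The function names and the dictionary encoding of the transition function are illustrative choices and do not come from the paper.

```python
def transition_monoid(num_states, letters, delta):
    """Return all state transformations induced by words.

    A transformation is a tuple t with t[q] = state reached from q.
    delta[(q, a)] is the successor of state q under letter a (assumed total).
    """
    identity = tuple(range(num_states))
    generators = [tuple(delta[(q, a)] for q in range(num_states)) for a in letters]
    monoid = {identity}
    frontier = [identity]
    while frontier:
        t = frontier.pop()
        for g in generators:
            # transformation of the word "t followed by one more letter"
            s = tuple(g[t[q]] for q in range(num_states))
            if s not in monoid:
                monoid.add(s)
                frontier.append(s)
    return monoid


def is_aperiodic(monoid):
    """Check that every element x satisfies x^n = x^(n+1) for some n."""
    def compose(s, t):  # first s, then t
        return tuple(t[s[q]] for q in range(len(s)))

    for x in monoid:
        seen = []
        power = x
        while power not in seen:
            seen.append(power)
            power = compose(power, x)
        # The powers of x are eventually periodic; aperiodic means period 1,
        # i.e. the first repeated power equals the last new one.
        if power != seen[-1]:
            return False
    return True


# Minimal automaton of (ab)* with sink state 2 (star-free language).
ab_star = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 2,
           (1, "b"): 0, (2, "a"): 2, (2, "b"): 2}
# Minimal automaton of (aa)* over the one-letter alphabet (not star-free).
aa_star = {(0, "a"): 1, (1, "a"): 0}

print(is_aperiodic(transition_monoid(3, "ab", ab_star)))
print(is_aperiodic(transition_monoid(2, "a", aa_star)))
```

The search enumerates the monoid by repeatedly appending letters, so it terminates because there are only finitely many transformations of a finite state set.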

So, in order to apply the existing algebraic framework to a class of languages defined 
by some class of formulae, it is important that the resulting class of languages has closure 
properties like (positive) Boolean operations, residuals, and inverse (C-)morphisms. In this 
paper, we introduce a formal notion of logical fragment such that language classes defined 
by fragments admit such closure properties. In addition, almost all logical fragments in the 
literature also form fragments in the sense of this paper. We have chosen monadic second- 
order logic over words on a broad base of atoms as framework of formal logic because this 
setting exhausts most variants of first-order logic and monadic second-order logic found in 
the literature, cf. [3 E mUZl [H HH UHl 121 123^ . It includes atomic predicates for order, 
successor, and modular predicates as well as quantifiers for first-order, second-order and 
modular counting quantification. 

The usual approach to closure properties of logical fragments is either indirect (i.e., by 
showing equivalence with some class of languages for which the closure properties are already 
known) or it relies on methods such as Ehrenfeucht-Fraïssé games; see e.g. [17]. The most 
general result along this model-theoretic line of work is due to Straubing [18, Theorem 3]. 
For several combinations of restrictions within first-order logic (quantifier depth, number of 
variables, numerical predicates, and set of quantifiers), he showed closure under residuals 
and inverse C-morphisms. One can obtain Straubing's result from our main Theorem 1. 
Sometimes, model-theoretic methods are difficult to apply to modalities such as modular 
quantifiers. Our syntactic transformation, in contrast, allows us to treat modular quantifiers 
uniformly as one of many cases of how to compose formulae. Moreover, it is conceptually 
easy to extend fragments by modalities which we did not consider in this paper. 

Finally, we consider two examples which illustrate the formal notion of fragments 
introduced in this paper. Both examples cannot be easily described using Ehrenfeucht-Fraïssé 
games. The first example is $\mathbb{B}\Sigma_1[\le]$, i.e., Boolean combinations of positive existential 
first-order formulae using only $\le$. This leads to stutter-invariant piecewise testable languages. 
The second example is a "syntactic" fragment of $\Sigma_2$ which is expressively complete for 
two-variable first-order logic $\mathrm{FO}^2$. This restriction of $\Sigma_2$ requires that the comparison graph 
be acyclic. The vertices of the comparison graph are the variables and the edges reflect 
the comparisons. The resulting characterization of $\mathrm{FO}^2$ is complementary to the result of 
Thérien and Wilke who showed that $\mathrm{FO}^2$ and the "semantic" fragment $\Delta_2$ of $\Sigma_2$ have the 
same expressive power [22]. 

For a concise presentation of the main results, most proofs were moved to the appendix. 

2. Preliminaries 

A language over an alphabet $A$ is a subset of finite words in $A^*$. The empty word is 1 
and $A^+ = A^* \setminus \{1\}$ is the set of finite nonempty words over $A$. The set $A^*$ of finite words 
over $A$ is the free monoid generated by $A$. It is finitely generated if $A$ is a finite set. A 
residual of a language $L \subseteq A^*$ is a language of the form $u^{-1}Lv^{-1} = \{w \in A^* \mid uwv \in L\}$ 
where $u, v \in A^*$. It is a left residual if $v = 1$ and it is a right residual if $u = 1$. Let 
$h : B^* \to A^*$ be a morphism between free monoids. The inverse image $h^{-1}(L)$ of $L$ 
under $h$ is the language $h^{-1}(L) = \{w \in B^* \mid h(w) \in L\}$ over $B$. The morphism $h$ is 
non-erasing (respectively, length-reducing, length-preserving) if for all $b \in B$ we have $h(b) \in A^+$ 
(respectively, $h(b) \in A \cup \{1\}$, $h(b) \in A$). The morphism $h$ is length-multiplying if there exists 
$m \in \mathbb{N}$ such that $h(b) \in A^m$ for all $b \in B$. Note that by the universal property of free monoids, 
a morphism between free monoids is completely determined by its images of the letters. If 
C is a family of morphisms, then $h$ is a C-morphism if $h \in C$. We introduce the following 
families of morphisms: all morphisms $C_{\mathrm{all}}$ between finitely generated free monoids, 
non-erasing morphisms $C_{\mathrm{ne}}$, length-multiplying morphisms $C_{\mathrm{lm}}$, length-reducing morphisms $C_{\mathrm{lr}}$, 
length-preserving morphisms $C_{\mathrm{lp}}$. 
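
The notions above are easy to experiment with. The following Python sketch represents a morphism by the images of its letters (which determines it completely, as noted above) and a language by a membership predicate. All names are hypothetical and chosen for illustration only; words are plain Python strings and the empty string plays the role of the empty word 1.

```python
def classify_morphism(images):
    """images maps each letter b of B to the word h(b) over A."""
    lengths = {len(word) for word in images.values()}
    return {
        "non-erasing": all(n >= 1 for n in lengths),        # h(b) in A^+
        "length-reducing": all(n <= 1 for n in lengths),     # h(b) in A or empty
        "length-preserving": all(n == 1 for n in lengths),   # h(b) in A
        "length-multiplying": len(lengths) <= 1,             # all images have one length m
    }


def apply_morphism(images, word):
    """Extend the letter images to the word h(word)."""
    return "".join(images[b] for b in word)


def in_inverse_image(images, in_language, word):
    """Membership of `word` in h^{-1}(L), with L given as a predicate."""
    return in_language(apply_morphism(images, word))


def in_residual(in_language, u, v, word):
    """Membership of `word` in u^{-1} L v^{-1}, i.e. whether u word v is in L."""
    return in_language(u + word + v)


# Example: h(x) = ab, h(y) = ba, and L = words containing the factor aa.
h = {"x": "ab", "y": "ba"}
contains_aa = lambda w: "aa" in w
print(classify_morphism(h))
print(in_inverse_image(h, contains_aa, "yx"))
```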

Logic over Words. We consider monadic second-order logic interpreted over finite words. 
In the context of logic, words are viewed as labeled linear orders. Positions are positive 
integers with 1 being the first position. Labels come from a fixed countable universe of letters 
$\Lambda$. The set of variables is $\mathbb{V}_1 \cup \mathbb{V}_2$ where $\mathbb{V}_1$ is an infinite set of first-order variables and $\mathbb{V}_2$ 
is an infinite set of second-order variables. First-order variables range over positions of the 
word and are denoted by lowercase letters (e.g., $x, y, x_i \in \mathbb{V}_1$) whereas second-order variables 
range over subsets of positions and are denoted by uppercase letters (e.g., $X, Y, X_i \in \mathbb{V}_2$). 
Atomic formulae include 

• the constants $\top$ (for true) and $\bot$ (for false), 

• the 0-ary predicate "empty" which is only true for the empty model, 

• the label predicate $\lambda(x) = a$ which holds if the position of $x$ is labeled with $a \in \Lambda$, 

• the second-order predicate $x \in X$ which is true if $x$ is contained in $X$, 

and the following numerical predicates: 

• the first-order equality predicate $x = y$, 

• the (strict and non-strict) order predicates $x < y$ and $x \le y$, 

• the successor predicate $\mathrm{suc}(x, y)$ with the interpretation $x + 1 = y$, 

• the minimum and maximum predicates $\min(x)$ and $\max(x)$ which identify the first and 
the last position, respectively, 

• the modular predicate $x \equiv r \pmod{q}$ which is true if the position of $x$ is congruent to 
$r$ modulo $q$. 

Formulae $\varphi$ and $\psi$ can be composed by the Boolean connectives (i.e., by negation $\neg\varphi$, 
disjunction $\varphi \vee \psi$ and conjunction $\varphi \wedge \psi$), by existential and universal first-order 
quantification $\exists x\, \varphi$ and $\forall x\, \varphi$, by existential and universal second-order quantification $\exists X\, \varphi$ and 
$\forall X\, \varphi$, and by modular counting quantification $\exists^{r \bmod q} x\, \varphi$. The latter is true if, modulo $q$, 
there are $r$ positions for $x$ which make $\varphi$ true. Parentheses may be used for disambiguation 
and to increase readability. The set $\mathrm{FV}(\varphi) \subseteq \mathbb{V}_1 \cup \mathbb{V}_2$ of free variables of $\varphi$ is defined as 
usual. A sentence is a formula without free variables. 

We only give a sketch of the formal semantics of formulae. A precise definition can be 
found in the appendix. In the course of the evaluation of a formula, it is necessary to handle 
formulae with free variables. The idea is to encode their interpretation by enlarging the 
alphabet to include sets of variables. A first-order variable evaluates to a position $i$ if the 
label of $i$ contains the name of this variable. Similarly, a position $i$ is contained in the 
evaluation of a second-order variable if the variable name is contained in the label of $i$. 
Specifically, if $\varphi$ is a formula and $V$ is a set of variables such that $\mathrm{FV}(\varphi) \subseteq V$, then the 
semantics $[\![\varphi]\!]_V$ is a set of words $w = (a_1, J_1) \cdots (a_n, J_n)$ with $a_i \in \Lambda$ and $J_i \subseteq V$ such that 
for every first-order variable $x \in V$ there exists exactly one position $i$ such that $x \in J_i$. The 
interpretation of a free first-order variable $x$ is then given by $x(w) = i$ for this unique index. 
For a second-order variable $X$ the interpretation $X(w) = \{i \in \{1, \ldots, n\} \mid X \in J_i\}$ is the 
set of positions containing $X$. With this, it is straightforward to define the semantics so as 
to coincide with the intuition given above. 

We define the following particular classes of formulae: 

• $\mathrm{MSO}_{\mathrm{mod}}$ is the class of all formulae. 

• MSO is the class of all formulae without the quantifier $\exists^{r \bmod q}$. 

• $\mathrm{FO}_{\mathrm{mod}}$ is the class of all first-order formulae including modular quantifiers (i.e., without 
second-order variables). 

• FO is the class of all formulae in $\mathrm{FO}_{\mathrm{mod}}$ without the quantifier $\exists^{r \bmod q}$. 

Let $\mathcal{F}$ be a class of formulae. For a set $\mathcal{P} \subseteq \{\mathrm{empty}, <, \le, =, \mathrm{suc}, \min, \max, \equiv\}$ of predicates 
denote by $\mathcal{F}[\mathcal{P}]$ the class of formulae in $\mathcal{F}$ which (apart from $\top$, $\bot$, labels, and atomic 
formulae of the form $x \in X$) only use predicates in $\mathcal{P}$. This notation is refined by the 
following. For a class $\mathcal{P}$ of atomic formulae let $\mathcal{F}[\mathcal{P}]$ be the class of formulae in $\mathcal{F}$ which 
(apart from $\top$, $\bot$, labels, and atomic formulae of the form $x \in X$) only use atomic formulae 
in $\mathcal{P}$. For example, $\mathrm{FO}[<]$ consists of all first-order formulae which only use atomic formulae 
of the form $\top$, $\bot$, $\lambda(x) = a$ and $x < y$ for arbitrary $x, y \in \mathbb{V}_1$, whereas $\mathrm{FO}[x_1 < x_2]$ only 
allows atomic formulae $\top$, $\bot$, $\lambda(x) = a$ and $x_1 < x_2$, but no order comparison of any other 
first-order variables. We also say that $\mathcal{F}$ contains some specific predicate (like the successor 
predicate) if there is a formula in $\mathcal{F}$ which uses this predicate. 

Logic and Languages. For a sentence $\varphi$ and an alphabet $A \subseteq \Lambda$ the language (over $A$) 
defined by $\varphi$ is the set $L_A(\varphi) = \{a_1 \cdots a_n \mid (a_1, \emptyset) \cdots (a_n, \emptyset) \in [\![\varphi]\!]_{A,\emptyset}\}$. It is the projection 
onto the letter-component of $[\![\varphi]\!]_{A,\emptyset}$. If the alphabet is clear from the context, then we drop 
it from the subscript and write $L(\varphi)$. For a class of formulae $\mathcal{F}$ the class of languages $\mathcal{L}(\mathcal{F})$ 
defined by $\mathcal{F}$ maps every finite alphabet $A \subseteq \Lambda$ to the set 

$\mathcal{L}_A(\mathcal{F}) = \{L_A(\varphi) \mid \varphi \in \mathcal{F} \text{ is a sentence}\}$ 

of languages over $A$. For a class of languages $\mathcal{Q}$ the class of languages defined by $\mathcal{F}$ over $\mathcal{Q}$ is 
the class of languages mapping $A$ to $\mathcal{L}_A(\mathcal{F}) \cap \mathcal{Q}_A$. Specifically, the class of languages defined 
by $\mathcal{F}$ over nonempty words maps $A$ to $\mathcal{L}_A(\mathcal{F}) \cap 2^{A^+}$. Note that in $L_A(\varphi)$, the alphabet $A$ and 
the set of labels used in a formula $\varphi$ may well be incomparable; a label predicate $\lambda(x) = a$ 
with $a \notin A$ will always be false when considering the semantics over $A$. On the other hand, a 
formula need of course not use all labels of the alphabet over which structures are built. For 
example, consider the formula $\varphi = \exists x\colon \lambda(x) = a$ requiring that there be an $a$-position. If $a \notin A$, 
then $L_A(\varphi) = \emptyset$ because all positions of a word $w$ over $A$ are non-$a$-positions; interpreted 
over the alphabet $A = \{a, b\}$ this formula defines the language $A^* a A^*$. This might seem 
unintuitive at first glance but allows a more uniform handling of languages over different 
alphabets and avoids tedious notation and many case-distinctions. 
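
The dependence of the defined language on the alphabet can be checked mechanically on short words. The following Python sketch is a naive evaluator for a small first-order part of the logic (label predicate, strict order, Boolean connectives, existential quantification). The tuple encoding of formulae and the function names are assumptions made for this illustration; the paper's formal semantics via extended alphabets is replaced here by an explicit variable assignment.

```python
from itertools import product


def holds(formula, word, env):
    """Evaluate a first-order formula on `word`; positions are 1-based."""
    op = formula[0]
    if op == "label":            # ("label", x, a) encodes lambda(x) = a
        _, x, a = formula
        return word[env[x] - 1] == a
    if op == "less":             # ("less", x, y) encodes x < y
        return env[formula[1]] < env[formula[2]]
    if op == "not":
        return not holds(formula[1], word, env)
    if op == "and":
        return holds(formula[1], word, env) and holds(formula[2], word, env)
    if op == "or":
        return holds(formula[1], word, env) or holds(formula[2], word, env)
    if op == "exists":           # ("exists", x, body)
        _, x, body = formula
        return any(holds(body, word, {**env, x: i})
                   for i in range(1, len(word) + 1))
    raise ValueError(f"unknown connective: {op}")


def language_up_to(sentence, alphabet, max_len):
    """All words over `alphabet` of length <= max_len that satisfy the sentence."""
    return {
        "".join(w)
        for n in range(max_len + 1)
        for w in product(sorted(alphabet), repeat=n)
        if holds(sentence, "".join(w), {})
    }


# The sentence "there is an a-position" from the example above.
phi = ("exists", "x", ("label", "x", "a"))
print(language_up_to(phi, {"a", "b"}, 2))   # words of A*aA* up to length 2
print(language_up_to(phi, {"b", "c"}, 3))   # empty: a is not in the alphabet
```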

Fragments. In this section fragments are introduced as classes of formulae with natural 
closure properties on the syntactic level. As we shall see in Section 3, these syntactic 
properties transfer to closure under natural semantic operations. 

A context is a formula with a unique occurrence of an additional constant predicate $\circ$ (to 
be read as "hole"). It is primitive if it does not use any label predicate. We shall denote 
primitive contexts by $\mu$ and contexts that a priori need not be primitive by $\nu$. The intuition 
is that $\circ$ is a place-holder where a formula can be plugged in. Let $\nu(\varphi)$ be the result of 
substituting $\varphi$ for the unique occurrence of $\circ$ in $\nu$. Contexts allow us to elegantly describe 
subformulae, as $\varphi$ is a subformula of $\psi$ if and only if there exists a context $\nu$ such that 
$\psi = \nu(\varphi)$. 

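Definition 1 A fragment $\mathcal{F}$ is a nonempty class of formulae such that for all primitive 
contexts $\mu$, all formulae $\varphi, \psi$, all $a \in \Lambda$ and all $x, y \in \mathbb{V}_1$: 

1. If $\mu(\varphi) \in \mathcal{F}$, then $\mu(\top) \in \mathcal{F}$ and $\mu(\bot) \in \mathcal{F}$ and $\mu(\lambda(x) = a) \in \mathcal{F}$, 

2. $\mu(\varphi \vee \psi) \in \mathcal{F}$ if and only if $\mu(\varphi) \in \mathcal{F}$ and $\mu(\psi) \in \mathcal{F}$, 

3. $\mu(\varphi \wedge \psi) \in \mathcal{F}$ if and only if $\mu(\varphi) \in \mathcal{F}$ and $\mu(\psi) \in \mathcal{F}$, 

4. if $\mu(\exists x\, \varphi) \in \mathcal{F}$ and $x \notin \mathrm{FV}(\varphi)$, then $\mu(\varphi) \in \mathcal{F}$. 

It is closed under negation if $\varphi \in \mathcal{F}$ implies $\neg\varphi \in \mathcal{F}$. ◊ 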

Next, we give an intuition for fragments in terms of local substitution operations. Let $\mathcal{F}$ 
be a class of formulae and let $\varphi$ and $\psi$ be formulae. The syntactic preorder of $\mathcal{F}$ is defined 
by $\varphi \le_{\mathcal{F}} \psi$ if $\mu(\psi) \in \mathcal{F}$ implies $\mu(\varphi) \in \mathcal{F}$ for every primitive context $\mu$. Intuitively, $\varphi \le_{\mathcal{F}} \psi$ 
means that, with respect to $\mathcal{F}$, the formula $\varphi$ is syntactically not more "complicated" than $\psi$. 
Similarly, we let $\varphi \preceq_{\mathcal{F}} \psi$ if $\nu(\psi) \in \mathcal{F}$ implies $\nu(\varphi) \in \mathcal{F}$ for all contexts $\nu$. The syntactic 
preorder allows us to reformulate some of the axioms of a fragment. For example, property 1 
in Definition 1 is equivalent to $\top \le_{\mathcal{F}} \varphi$ and $\bot \le_{\mathcal{F}} \varphi$ and $(\lambda(x) = a) \le_{\mathcal{F}} \varphi$ for all formulae $\varphi$. 
Note that $\varphi \preceq_{\mathcal{F}} \psi$ implies $\varphi \le_{\mathcal{F}} \psi$. The reverse is however not true for arbitrary classes of 
formulae. Let for example $\mathcal{F}$ consist of all formulae containing at most one label predicate. 
In this case, we have $(\lambda(x) = a) \le_{\mathcal{F}} \top$. If $\nu$ is the context $\circ \wedge \lambda(x) = a$, then $\nu(\top) \in \mathcal{F}$ and 
$\nu(\lambda(x) = a) \notin \mathcal{F}$. Hence $(\lambda(x) = a) \not\preceq_{\mathcal{F}} \top$. For fragments on the other hand, this cannot 
happen because here, $\le_{\mathcal{F}}$ and $\preceq_{\mathcal{F}}$ are equivalent by the following lemma. 

Lemma 1 If $\mathcal{F}$ is a fragment, then $\varphi \le_{\mathcal{F}} \psi$ if and only if $\varphi \preceq_{\mathcal{F}} \psi$. □ 

This provides an intuition for fragments: In a formula from a fragment one may replace 
arbitrary subformulae by $\le_{\mathcal{F}}$-smaller formulae without leaving $\mathcal{F}$. Note that this is not 
immediate from the definition of a fragment because in general, primitive contexts are not 
sufficient to formalize subformulae (as the "rest" of the formula may contain label predicates). 
On the other hand it is also natural to attach an alphabet to a formula (which it is interpreted 
over) and in this case, primitive contexts do not interfere with the alphabet of the formula. 

3. Fragments and C-varieties 

This section summarizes semantic closure properties of fragments. In Proposition 1 and 
Proposition 2 we give conditions for a fragment to be closed under residuals and inverse 
morphisms, respectively. The combination of these two propositions gives our main result 
Theorem 1 which formulates closure properties of languages defined by fragments in terms 
of C-varieties. For closure under residuals we need some more assumptions. 

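Definition 2 A fragment $\mathcal{F}$ is suc-stable if for all primitive contexts $\mu$ and all $x, y \in \mathbb{V}_1$: 

1. If $\mu(\mathrm{suc}(x, y)) \in \mathcal{F}$, then $\mu(x = y) \in \mathcal{F}$. 

2. If $\mu(\mathrm{suc}(x, y)) \in \mathcal{F}$, then $\mu(\max(x)) \in \mathcal{F}$ and $\mu(\min(y)) \in \mathcal{F}$. 

3. If $\mu(\min(x)) \in \mathcal{F}$ or if $\mu(\max(x)) \in \mathcal{F}$, then $\mu(\mathrm{empty}) \in \mathcal{F}$. 

It is mod-stable if for all primitive contexts $\mu$, all formulae $\varphi$, all $x \in \mathbb{V}_1$ and all $q, r \in \mathbb{Z}$: 

1. $\mu(x \equiv r \pmod{q}) \in \mathcal{F}$ if and only if $\mu(x \equiv s \pmod{q}) \in \mathcal{F}$ for all $s \in \mathbb{Z}$. 

2. $\mu(\exists^{r \bmod q} x\, \varphi) \in \mathcal{F}$ if and only if $\mu(\exists^{s \bmod q} x\, \varphi) \in \mathcal{F}$ for all $s \in \mathbb{Z}$. 

3. If $\mu(\exists^{r \bmod q} x\, \varphi) \in \mathcal{F}$ and $x \notin \mathrm{FV}(\varphi)$, then $\mu(\varphi) \in \mathcal{F}$ and $\mu(\neg\varphi) \in \mathcal{F}$. ◊ 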

Consider the left residual, i.e., given a formula $\varphi$ and a word $w$, we want to determine 
the truth value of $\varphi$ on $aw$. Conceptually, setting a variable to the "phantom" $a$-position in 
front of the word is handled syntactically resulting in a formula $a^{-1}\varphi$ defining the residual. 
To do this consistently, we keep track of these variables using the extended alphabet from 
the formal semantics of formulae. The above stability properties thereby allow us to maintain 
$a^{-1}\varphi \le_{\mathcal{F}} \varphi$ as an invariant. The actual construction is rather lengthy and can be found in 
the appendix. 

Proposition 1 Let $\mathcal{F}$ be a fragment and suppose that $\mathcal{F}$ is suc-stable and mod-stable. Then 
the class of languages defined by $\mathcal{F}$ is closed under residuals. □ 

Note that if $\mathcal{F}$ does not contain suc, max or min, then $\mathcal{F}$ trivially is suc-stable. Similarly, 
$\mathcal{F}$ is mod-stable if it does not contain a modular predicate. 

We now turn to closure under inverse morphisms. Here, we need the following additional 
properties of fragments. 

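Definition 3 A fragment $\mathcal{F}$ is order-stable if $\mu(x < y) \in \mathcal{F}$ if and only if $\mu(x \le y) \in \mathcal{F}$ 
for all primitive contexts $\mu$ and all $x, y \in \mathbb{V}_1$. 

It is MSO-stable if for all primitive contexts $\mu$, all formulae $\varphi$, all $x \in \mathbb{V}_1$ and all $X, Y \in \mathbb{V}_2$: 

1. If $\mu(x \in X) \in \mathcal{F}$, then $\mu(x \in Y) \in \mathcal{F}$. 

2. If $\mu(\exists X\, \varphi) \in \mathcal{F}$, then $\mu(\exists Y\, \exists X\, \varphi) \in \mathcal{F}$. 

3. If $\mu(\forall X\, \varphi) \in \mathcal{F}$, then $\mu(\forall Y\, \forall X\, \varphi) \in \mathcal{F}$. ◊ 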

We obtain the result as follows. For every morphism $h : B^* \to A^*$ and every formula $\varphi$ we 
construct a formula $h^{-1}(\varphi)$ defining the inverse morphic image of $L_A(\varphi)$ with $h^{-1}(\varphi) \le_{\mathcal{F}} \varphi$. 
Basically, a position $i$ on $h(w)$ can be represented by its corresponding position on $w$ (called 
the origin of $i$) combined with some offset (bounded by the maximal length $|h(b)|$ for letters 
$b \in B$). For first-order variables the offset is stored syntactically and second-order variables 
are distributed over several variables, depending on the offset. As for residuals, the actual 
construction is technically involved and can be found in Appendix D. 

Typically, if a fragment $\mathcal{F}$ contains more modalities, then either $\mathcal{F}$ has to satisfy more 
closure properties or it is closed under fewer inverse morphisms. This trade-off between 
closure properties and inverse morphisms is given by the implications in Proposition 2, each 
implication covering certain modalities in $\mathcal{F}$. 

Proposition 2 Let $\mathcal{F}$ be a fragment and let C be a family of morphisms between finitely 
generated free monoids. Suppose the following: 

1. If $\mathcal{F}$ contains a second-order quantifier, then $\mathcal{F}$ is MSO-stable or all C-morphisms are 
length-reducing. 

2. If $\mathcal{F}$ contains the predicate $<$ or $\le$, then $\mathcal{F}$ is order-stable or all C-morphisms are 
length-reducing. 

3. If $\mathcal{F}$ contains the predicate suc, min, max or empty, then all C-morphisms are 
non-erasing. 

4. If $\mathcal{F}$ contains a modular predicate, then all C-morphisms are length-multiplying and 
either $\mathcal{F}$ is mod-stable or all C-morphisms are length-preserving. 

5. If $\mathcal{F}$ contains a modular quantifier, then $\mathcal{F}$ is mod-stable or all C-morphisms are 
length-reducing. 

Then the class of languages defined by $\mathcal{F}$ is closed under inverse C-morphisms. □ 

In particular, the class of languages defined by any fragment is closed under inverse 
length-preserving morphisms. 

We now turn to C- varieties of which we only give the definition; for details see [HI [TH] . A 
category C of morphisms between finitely generated free monoids is a family of morphisms 
between finitely generated free monoids which contains the identity morphisms and which is 
closed under composition. A positive C-variety is a class of languages which is closed under 
positive Boolean combination, residuals and inverse C-morphisms. It is a C-variety if it is 
closed under complement. Examples for categories of morphisms include $C_{\mathrm{all}}$, $C_{\mathrm{ne}}$, $C_{\mathrm{lm}}$, $C_{\mathrm{lr}}$, 
and $C_{\mathrm{lp}}$. 

Our main result is the next theorem from which in particular the main results of a paper 
by Straubing can be obtained [18, Theorem 3]. Intuitively, the more closure properties some 
fragment $\mathcal{F}$ has, the larger is the class of inverse morphisms under which $\mathcal{L}(\mathcal{F})$ is closed. In 
Theorem 1 below this is formalized by a sequence of implications. 

Theorem 1 Let $\mathcal{F}$ be a mod-stable and suc-stable fragment. Let C be a category of 
morphisms between finitely generated free monoids. Suppose the following: 

1. If $\mathcal{F}$ contains a second-order quantifier, then $\mathcal{F}$ is MSO-stable or all C-morphisms are 
length-reducing. 

2. If $\mathcal{F}$ contains the predicate $<$ or $\le$, then $\mathcal{F}$ is order-stable or all C-morphisms are 
length-reducing. 

3. If $\mathcal{F}$ contains the predicate suc, min, max or empty, then all C-morphisms are 
non-erasing. 

4. If $\mathcal{F}$ contains a modular predicate, then all C-morphisms are length-multiplying. 

Then the class of languages defined by $\mathcal{F}$ is a positive C-variety. 

Proof: We have to show that $\mathcal{L}(\mathcal{F})$ is closed under union, intersection, residuals and inverse 
C-morphisms. Using the primitive context $\circ$ it is easy to see that $\mathcal{F}$ is closed under 
disjunction and conjunction and, consequently, $\mathcal{L}(\mathcal{F})$ is closed under union and intersection. 
It remains to show that $\mathcal{L}(\mathcal{F})$ is closed under residuals and inverse C-morphisms. Closure 
under residuals is Proposition 1 and closure under inverse C-morphisms is Proposition 2. □ 

A (positive) $*$-variety is a (positive) $C_{\mathrm{all}}$-variety and a (positive) $+$-variety is a (positive) 
$C_{\mathrm{ne}}$-variety of languages of nonempty words. We get the following corollaries for fragments 
using equality, order and successor. Note in particular that the predicate "empty" is void 
over nonempty words and that every first-order fragment trivially is MSO-stable. 

Corollary 1 Let $\mathcal{F} \subseteq \mathrm{MSO}[<, \le, =]$ be a fragment which is MSO-stable and order-stable. 
Then $\mathcal{F}$ defines a positive $*$-variety. □ 

Corollary 2 Let $\mathcal{F} \subseteq \mathrm{MSO}[<, \le, =, \mathrm{suc}, \min, \max]$ be an MSO-stable and order-stable 
fragment. Suppose $\min(y) \le_{\mathcal{F}} \mathrm{suc}(x, y)$ and $\max(x) \le_{\mathcal{F}} \mathrm{suc}(x, y)$ for all first-order variables 
$x$ and $y$. Then the class of languages defined by $\mathcal{F}$ over nonempty words forms a positive 
$+$-variety. □ 



4. Stutter-Invariant Piecewise Testable Languages 

A language is a simple monomial if it is of the form $A^* a_1 \cdots A^* a_n A^*$. A language $L \subseteq A^*$ 
is piecewise testable if it is a finite Boolean combination of simple monomials. It is 
stutter-invariant if $paq \in L$ if and only if $paaq \in L$ for all $a \in A$ and all $p, q \in A^*$. 
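
Both notions are simple to test on concrete words. The Python sketch below checks membership in a simple monomial (which amounts to a scattered-subword test) and collapses repeated letters in a word. The helper names are hypothetical; in particular, `destutter` acts on words and is only an illustration of stuttering, not an operation defined in the paper.

```python
from itertools import groupby


def in_simple_monomial(letters, word):
    """Is `word` in A* a_1 A* ... A* a_n A*, where letters = a_1 ... a_n?"""
    remaining = iter(word)
    # `a in remaining` advances the iterator past the first occurrence of a,
    # so the letters are matched from left to right.
    return all(a in remaining for a in letters)


def destutter(word):
    """Collapse every maximal block of equal letters to a single letter."""
    return "".join(letter for letter, _ in groupby(word))


print(in_simple_monomial("ab", "xaxbx"))
print(destutter("aabbba"))
```

If consecutive letters of the monomial are distinct, membership of a word should not change when the word is passed through `destutter`; this is the stutter-invariance that the next proposition exploits.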

Let $\Sigma_1$ consist of all FO-formulae without negation and without any universal quantifier. 
Let $\mathbb{B}\Sigma_1$ be the fragment which consists of all Boolean combinations of formulae in $\Sigma_1$. By 
Theorem 1 the class of languages definable in $\mathbb{B}\Sigma_1[\le]$ forms a $C_{\mathrm{lr}}$-variety. The following 
proposition describes the class of languages definable in $\mathbb{B}\Sigma_1[\le]$ in terms of stutter-invariant 
piecewise testable languages. 

Proposition 3 Let $L \subseteq A^*$ be a language. The following are equivalent: 

1. $L$ is definable in $\mathbb{B}\Sigma_1[\le]$. 

2. $L$ is piecewise testable and stutter-invariant. 

3. $L$ is a Boolean combination of simple monomials of the form $A^* a_1 \cdots A^* a_n A^*$ with 
$a_i \ne a_{i+1}$ for all $i$. 

Proof: We first show "(1) $\Rightarrow$ (2)". If $L$ is $\mathbb{B}\Sigma_1[\le]$-definable, then of course $L$ is 
$\mathbb{B}\Sigma_1[<, =]$-definable. The latter is equivalent to $L$ being piecewise testable, see e.g. [5]. It is easy to 
see that the class of languages defined by $\Sigma_1[\le]$ is stutter-invariant. The claim follows since 
stutter-invariant languages are closed under Boolean operations. 

"(2) $\Rightarrow$ (3)": Let $L \subseteq A^*$ be piecewise testable and stutter-invariant. Since $L$ is piecewise 
testable, we can write $L = \bigcup_i P_i \setminus \bigcup_j Q_j$ (finite unions) where $P_i$ and $Q_j$ are simple monomials. 
Suppose $P = (A^* a_1)^{e_1} \cdots (A^* a_n)^{e_n} A^*$ for positive integers $e_i$, $a_i \in A$ and $a_i \ne a_{i+1}$. Then 
$\mathrm{red}(P) = A^* a_1 \cdots A^* a_n A^*$ is the monomial obtained by discarding successive $a_i$'s with the 
same label. Note that $\mathrm{red}(P)$ is stutter-invariant and $P \subseteq \mathrm{red}(P)$. It suffices to show 
$L = \bigcup_i \mathrm{red}(P_i) \setminus \bigcup_j \mathrm{red}(Q_j)$. For the containment from left to right assume $u \in L$ and $u \in \mathrm{red}(Q_j)$ 
for some $j$. Let $\mathrm{red}(Q_j) = A^* a_1 \cdots A^* a_n A^*$ with $a_i \ne a_{i+1}$ and let $u = u_1 a_1 \cdots u_n a_n u_{n+1}$. 
Then there exist positive integers $e_i$ such that $u' = u_1 a_1^{e_1} \cdots u_n a_n^{e_n} u_{n+1} \in Q_j$. Therefore, 
$u' \notin L$ and, by stutter-invariance of $L$, we conclude $u \notin L$, a contradiction. For the converse 
let $u \in \mathrm{red}(P_i)$ for some $i$ such that $u \notin \bigcup_j \mathrm{red}(Q_j)$. Let $\mathrm{red}(P_i) = A^* a_1 \cdots A^* a_n A^*$ 
with $a_i \ne a_{i+1}$ and $u = u_1 a_1 \cdots u_n a_n u_{n+1}$. There exist positive integers $e_i$ such that 
$u' = u_1 a_1^{e_1} \cdots u_n a_n^{e_n} u_{n+1} \in P_i$ and stutter-invariance of the $\mathrm{red}(Q_j)$ yields $u' \notin \bigcup_j \mathrm{red}(Q_j)$. 
In particular $u' \notin \bigcup_j Q_j$ and thus $u' \in L$. By stutter-invariance of $L$ we get $u \in L$. 

"(3) $\Rightarrow$ (1)": Let $P = A^* a_1 \cdots A^* a_n A^*$ with $a_i \ne a_{i+1}$ for all $i$. Then $P$ is defined by 
the formula 

$\exists x_1 \cdots \exists x_n \colon \bigwedge_{i=1}^{n} \lambda(x_i) = a_i \;\wedge\; \bigwedge_{i=1}^{n-1} x_i \le x_{i+1}$. 

Note that in this formula, $x_i \le x_{i+1}$ implies $x_i < x_{i+1}$ since $a_i \ne a_{i+1}$. □ 

A famous result of Simon says that a language $L$ is piecewise testable if and only if the 
syntactic monoid of $L$ is finite and $\mathcal{J}$-trivial [16]. The latter property is decidable for finite 
monoids. Moreover, $L$ is stutter-invariant if and only if the image of every letter under the 
syntactic morphism of $L$ is idempotent. Combining these observations, Proposition 3 shows that it is 
decidable whether a given regular language is definable in $\mathbb{B}\Sigma_1[\le]$. 
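
The decision procedure suggested by these two criteria can be sketched in Python as follows. The sketch assumes that the language is given by its minimal complete deterministic automaton, so that the transformations induced by the letters generate a copy of the syntactic monoid and the transformation of a letter is its image under the syntactic morphism. The encoding (one tuple per letter) and all function names are illustrative assumptions; the brute-force ideal comparison is only meant for small examples.

```python
def compose(s, t):
    """Transformation obtained by applying s first and then t."""
    return tuple(t[q] for q in s)


def generated_monoid(generators, num_states):
    """Close the letter transformations under composition."""
    identity = tuple(range(num_states))
    monoid, frontier = {identity}, [identity]
    while frontier:
        t = frontier.pop()
        for g in generators:
            s = compose(t, g)
            if s not in monoid:
                monoid.add(s)
                frontier.append(s)
    return monoid


def is_j_trivial(monoid):
    """Distinct elements must generate distinct two-sided ideals M x M."""
    seen_ideals = set()
    for x in monoid:
        ideal = frozenset(compose(compose(s, x), t)
                          for s in monoid for t in monoid)
        if ideal in seen_ideals:   # some other element generates the same ideal
            return False
        seen_ideals.add(ideal)
    return True


def letters_idempotent(generators):
    """Every letter transformation g must satisfy g g = g."""
    return all(compose(g, g) == g for g in generators)


def stutter_invariant_piecewise_testable(generators, num_states):
    monoid = generated_monoid(generators, num_states)
    return is_j_trivial(monoid) and letters_idempotent(generators)


# A*aA* over {a, b}: states 0 (no a seen yet) and 1 (some a seen).
print(stutter_invariant_piecewise_testable([(1, 1), (0, 1)], 2))
# A*aA*aA*: piecewise testable, but the letter a is not idempotent.
print(stutter_invariant_piecewise_testable([(1, 2, 2), (0, 1, 2)], 3))
```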

5. The Acyclic Fragment of $\Sigma_2$ 

Let $\Sigma_2$ consist of all FO-formulae without negations such that there is no path in the 
parse-tree with an existential quantifier after a universal quantifier, i.e., on every path in the 
parse-tree all existential quantifiers occur before all universal quantifiers. The comparison 
graph of a formula $\varphi$ is the directed graph $G(\varphi) = (V, E)$ with $V$ being the set of variables 
occurring in $\varphi$ and $(x, y) \in E$ if and only if one of the atomic formulae $x < y$, $x \le y$, $x = y$ or 
$y = x$ occurs in $\varphi$. It is acyclic if there exist no $x_1, \ldots, x_n \in V$ such that $(x_i, x_{i+1}) \in E$ for all 
$1 \le i < n$ and $x_n = x_1$. Note that the class of formulae in $\Sigma_2[<, \le]$ with an acyclic comparison graph forms 
an order-stable fragment thus defining a positive $*$-variety. In fact, the following theorem 
implies that it defines a $*$-variety even though, syntactically, it is not closed under negation. 
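
Acyclicity of the comparison graph is a purely syntactic condition and can be checked directly from the list of comparison atoms of a formula. The following Python sketch assumes that the atoms have already been extracted and are given as triples; this input format and the function names are hypothetical.

```python
def comparison_graph(atoms):
    """Edges of the comparison graph.

    atoms: iterable of (x, rel, y) with rel in {"<", "<=", "="}.
    """
    edges = set()
    for x, rel, y in atoms:
        edges.add((x, y))
        if rel == "=":
            # An equality atom contributes edges in both directions.
            edges.add((y, x))
    return edges


def is_acyclic(edges):
    """Depth-first search for a directed cycle (self-loops count as cycles)."""
    successors = {}
    for x, y in edges:
        successors.setdefault(x, set()).add(y)
        successors.setdefault(y, set())
    state = {}  # vertex -> "active" while on the DFS stack, "done" afterwards

    def visit(v):
        state[v] = "active"
        for w in successors[v]:
            if state.get(w) == "active":
                return False          # back edge closes a cycle
            if w not in state and not visit(w):
                return False
        state[v] = "done"
        return True

    return all(v in state or visit(v) for v in successors)


print(is_acyclic(comparison_graph([("x", "<", "y"), ("y", "<=", "z")])))
print(is_acyclic(comparison_graph([("x", "<", "y"), ("y", "<", "x")])))
```

Under the definition above, any equality atom between two variables already yields a cycle, so acyclic formulae in this sketch use only the order predicates.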

Theorem 2 A language is definable in $\mathrm{FO}^2[<]$ if and only if it is definable by a formula in 
$\Sigma_2[<, \le]$ with an acyclic comparison graph. 

Proof: We only give an outline. The full proof can be found in Appendix F. The proof relies 
on two famous characterizations of the class of languages definable in $\mathrm{FO}^2[<]$. The first 
characterization is in terms of unions of unambiguous monomials and the second one is the 
variety $\mathbf{DA}$ of finite monoids; see [20, 5]. 

A language of the form $P = A_1^* a_1 \cdots A_n^* a_n A_{n+1}^*$ with $a_i \in A$ and $A_i \subseteq A$ is called a 
monomial. It is unambiguous if every word $u \in P$ has a unique factorization $u = u_1 a_1 \cdots u_n a_n u_{n+1}$ 
with $u_i \in A_i^*$. For the direction from left to right, it suffices to show that every 
unambiguous monomial $P = A_1^* a_1 \cdots A_n^* a_n A_{n+1}^*$ is definable by a formula in $\Sigma_2[<, \le]$ with an acyclic 
comparison graph. There exists some $a_i \notin A_1 \cap A_{n+1}$ since otherwise $(a_1 \cdots a_n)^2$ would 
admit two different factorizations. By symmetry, we can assume $a_i \notin A_1$. For every word 
$u \in P$ we consider the factorization $u = q a_i r$ such that $a_i$ does not occur in the prefix $q$. 
Then $q$ and $r$ are contained in smaller unambiguous monomials $Q \subseteq (A \setminus \{a_i\})^*$ and $R \subseteq A^*$, 
respectively, such that $Q a_i R \subseteq P$. By induction, there exist formulae for $Q$ and $R$. These 
formulae can be combined into a formula for $P$ using so-called relativizations. The main 
idea here is that the position of the first $a_i$ is unique and that several variables can be used 
to identify this position. This allows us to maintain an acyclic comparison graph. 

For the converse, we show that the syntactic monoid of $L(\varphi)$ is in $\mathbf{DA}$ if $\varphi$ is in $\Sigma_2[<, \le]$ 
with an acyclic comparison graph. For this, it suffices to show that for some sufficiently 
large integer $n > 1$, we have $p(uv)^n u(uv)^n q \in L(\varphi)$ if and only if $p(uv)^{2n} q \in L(\varphi)$ for all 
$p, q, u, v \in A^*$. It is easier to describe the outline of the proof using the terminology of 
Ehrenfeucht-Fraïssé games. We note that in this game, the winning condition is not defined 
in terms of isomorphisms of game situations and thus, it is not an Ehrenfeucht-Fraïssé game 
in the usual sense. Since every language definable in $\mathrm{FO}^2[<]$ is also definable in $\Sigma_2[<, \le]$, 
it follows that if Spoiler starts on the word $p(uv)^{2n} q$, then Duplicator wins for arbitrary 
comparison graphs [22]. Hence, Spoiler starts on $p(uv)^n u(uv)^n q$. Choosing $n$ large enough, 
we know that after Spoiler placed his pebbles on $p(uv)^n u(uv)^n q$, there are large gaps to the 
left and to the right of the central factor $u$. Duplicator plays as follows: Pebbles outside 
the center are placed on the respective position on $p(uv)^{2n} q$. For the pebbles in the central 
part, Duplicator's strategy basically is to make as many atomic formulae true on $p(uv)^{2n} q$ 
as possible. He can do this because the comparison graph is acyclic. In the second round 
Spoiler places his pebbles on $p(uv)^{2n} q$. Exploiting acyclicity again, Duplicator can use the 
gaps on $p(uv)^n u(uv)^n q$ to obtain a situation where as many atomic formulae as possible 
are false on $p(uv)^n u(uv)^n q$. The result is a situation such that $x_i < x_j$ (respectively, 
$x_i \le x_j$) on $p(uv)^n u(uv)^n q$ implies $x_i < x_j$ (respectively, $x_i \le x_j$) on $p(uv)^{2n} q$. Hence, 
$p(uv)^n u(uv)^n q \in L$ implies $p(uv)^{2n} q \in L$. □ 



6. Conclusion 

We introduced fragments as classes of formulae with natural syntactic closure properties.
Among others, these syntactic closure properties yield semantic closure under positive Boolean
operations for the corresponding classes of languages, i.e., every fragment defines a lattice
of languages. Our main result is that fragments often yield closure under residuals and
inverse morphisms. These properties lead to $\mathcal{C}$-varieties, thus allowing algebraic descriptions
in terms of the syntactic morphism. At the end of the paper, we considered two fragments
which are not easily captured by traditional techniques such as Ehrenfeucht-Fraïssé
games. The first example is the Boolean closure of $\Sigma_1[\le]$. This fragment corresponds to
the stutter-invariant subclass of piecewise testable languages. The second example is a
novel characterization of the $\mathrm{FO}^2[<]$-definable languages in terms of an acyclic fragment of
$\Sigma_2[<, \le]$.

We expect our constructions to be extensible to other structures such as infinite words.
Another line of work would be to assign a reasonable syntactic object to a given fragment.
The hope is that this object could be used for a general framework to solve questions like
"is the class of languages defined by the fragment $\mathcal{F}$ closed under complement?" or "is the
class of languages defined by the fragment $\mathcal{F}$ closed under shuffle?" In classical
Ehrenfeucht-Fraïssé games, a winning condition for Duplicator relies on isomorphisms between game
situations. We conjecture that asymmetric winning conditions as in the proof of Theorem 2
can be used to give combinatorial counterparts for arbitrary fragments.

Appendix



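A. Formal Syntax and Semantics of Monadic Second-Order
Logic with Modular Quantifiers

Let $\Lambda$ be a countable universe of labels. The syntax of a formula $\varphi$ is given by

\[
\begin{array}{l}
\top \mid \bot \mid \mathrm{empty} \mid \lambda(x) = a \mid x \in X \mid \\
x = y \mid x < y \mid x \le y \mid \mathrm{suc}(x,y) \mid \min(x) \mid \max(x) \mid x \equiv r \pmod{q} \mid \\
\neg\varphi \mid \varphi_1 \vee \varphi_2 \mid \varphi_1 \wedge \varphi_2 \mid \exists x\, \varphi \mid \forall x\, \varphi \mid \exists X\, \varphi \mid \forall X\, \varphi \mid \exists^{r \bmod q} x\, \varphi
\end{array}
\]

for $x, y \in \mathbb{V}_1$, $X \in \mathbb{V}_2$, $a \in \Lambda$, $r, q \in \mathbb{Z}$ and formulae $\varphi$, $\varphi_1$, $\varphi_2$. Note that we do
not impose a finiteness condition on $\Lambda$ but of course for every formula only a finite subset of $\Lambda$
occurs. We stipulate the usual shortcuts $\bigvee_{i=1}^{n} \varphi_i$ and $\bigwedge_{i=1}^{n} \varphi_i$ for formulae $\varphi_i$,
$i \in \{1, \ldots, n\}$, with the convention that for $n = 0$ the disjunction is $\bot$ and the conjunction is
$\top$. Moreover, if $A$ is a finite subset of $\Lambda$, then $\lambda(x) \in A$ is an abbreviation for the formula
$\bigvee_{a \in A} \lambda(x) = a$. Parentheses may be used for disambiguation and to increase readability.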

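Next, we give the formal semantics of formulae. Even though we are mainly interested in
sentences, we need to handle formulae with free variables in the definition of the semantics.
This is done by extending the alphabet in such a way that the interpretation of the free
variables can be encoded. For a formula $\varphi$ and a set of variables $V$ containing the free
variables of $\varphi$, the semantics is a subset of words over $\Lambda \times 2^V$ and denoted by
$\llbracket\varphi\rrbracket_V$. For an alphabet $A \subseteq \Lambda$ we let
$\llbracket\varphi\rrbracket_{A,V} = \llbracket\varphi\rrbracket_V \cap (A \times 2^V)^*$ be the semantics over $A$. The
idea is that the second component allows us to read off the interpretation of the free variables.
Suppose a position is labeled with $(a, J)$. Then a first-order variable $x$ is at this position
if and only if $x \in J$ and the second-order variable $X$ contains this position if and only if
$X \in J$.

Let $w = (a_1, J_1) \cdots (a_n, J_n)$ where $n \ge 0$, $a_i \in \Lambda$ and $J_i \subseteq V$. Then
$w \in \llbracket\top\rrbracket_V$ if and only if for all $x \in V \cap \mathbb{V}_1$ there is exactly one index
$i \in \{1, \ldots, n\}$ with $x \in J_i$. Notice that $1 \notin \llbracket\top\rrbracket_V$ if $V$ contains a first-order
variable but $1 \in \llbracket\top\rrbracket_\emptyset$. We are going to define the formal semantics such that
$\llbracket\varphi\rrbracket_V \subseteq \llbracket\top\rrbracket_V$. If $w \in \llbracket\top\rrbracket_V$, then the interpretation
of $X \in V$ on $w$ is the set of positions $X(w) = \{ i \in \{1, \ldots, n\} \mid X \in J_i \}$. Notice that $x(w)$
is a singleton set for every first-order variable $x$ and by abuse of notation we also write
$x(w)$ for the position contained in this singleton. We extend the label function to first-order
variables by setting $\lambda_w(x) = \lambda_w(x(w))$.

Suppose $w \in \llbracket\top\rrbracket_V$. Let $\llbracket\bot\rrbracket_V = \emptyset$. For "empty"
let $w \in \llbracket\mathrm{empty}\rrbracket_V$ if and only if $|w| = 0$. For the label predicate let
$w \in \llbracket\lambda(x) = a\rrbracket_V$ if and only if $\lambda_w(x) \in \{a\} \times 2^V$. For the containment
predicate let $w \in \llbracket x \in X\rrbracket_V$ if and only if $x(w) \in X(w)$. For the predicate
${\sim} \in \{=, <, \le\}$ let $w \in \llbracket x \sim y\rrbracket_V$ if and only if $x(w) \sim y(w)$. For the successor
let $w \in \llbracket\mathrm{suc}(x,y)\rrbracket_V$ if and only if $x(w) + 1 = y(w)$. Let
$w \in \llbracket\min(x)\rrbracket_V$ if and only if $x(w) = 1$ and let $w \in \llbracket\max(x)\rrbracket_V$ if and only if
$x(w) = |w|$. For the modular predicate let $w \in \llbracket x \equiv r \pmod{q}\rrbracket_V$ if and only if
$x(w) \equiv r \pmod{q}$. Boolean combinations are given inductively by
$\llbracket\neg\varphi\rrbracket_V = \llbracket\top\rrbracket_V \setminus \llbracket\varphi\rrbracket_V$ and
$\llbracket\varphi_1 \vee \varphi_2\rrbracket_V = \llbracket\varphi_1\rrbracket_V \cup \llbracket\varphi_2\rrbracket_V$ and
$\llbracket\varphi_1 \wedge \varphi_2\rrbracket_V = \llbracket\varphi_1\rrbracket_V \cap \llbracket\varphi_2\rrbracket_V$.

For the semantics of the first-order quantifiers we need to introduce some more notation. Let
$w[x/i] = (a_1, J_1') \cdots (a_n, J_n')$ with $J_i' = J_i \cup \{x\}$ and $J_j' = J_j \setminus \{x\}$ for all $j \ne i$. Let

\[
\llbracket\exists x\, \varphi\rrbracket_V = \{ w \in \llbracket\top\rrbracket_V \mid w[x/i] \in \llbracket\varphi\rrbracket_{V \cup \{x\}} \text{ for some } i \in \{1, \ldots, |w|\} \},
\]

\[
\llbracket\exists X\, \varphi\rrbracket_V = \{ (a_1, J_1) \cdots (a_n, J_n) \in \llbracket\top\rrbracket_V \mid (a_1, J_1') \cdots (a_n, J_n') \in \llbracket\varphi\rrbracket_{V \cup \{X\}} \text{ for some } J_1', \ldots, J_n' \text{ with } J_i \mathbin{\Delta} J_i' \subseteq \{X\} \}.
\]

Here $J \mathbin{\Delta} J' = (J \setminus J') \cup (J' \setminus J)$ is the symmetric difference.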

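For a formula $\varphi$ and a first-order variable $x$ let
$I_w(x, \varphi) = \{ i \in \{1, \ldots, |w|\} \mid w[x/i] \in \llbracket\varphi\rrbracket_{V \cup \{x\}} \}$
be the set of positions $i$ of $w$ such that $\varphi$ holds if $x$ is on position $i$. For the modular quantifier
let

\[
\llbracket\exists^{r \bmod q} x\, \varphi\rrbracket_V = \{ w \in \llbracket\top\rrbracket_V \mid |I_w(x, \varphi)| \equiv r \pmod{q} \}.
\]

For the universal quantifiers let
$\llbracket\forall x\, \varphi\rrbracket_V = \llbracket\neg\exists x\, \neg\varphi\rrbracket_V$ and
$\llbracket\forall X\, \varphi\rrbracket_V = \llbracket\neg\exists X\, \neg\varphi\rrbracket_V$. Note
that if $\mathrm{FV}(\varphi) \not\subseteq V$, then the semantics $\llbracket\varphi\rrbracket_V$ is undefined. Also
take notice that in case $q = 0$ the modular predicate degenerates to equality and the modular counting
quantifier counts the exact number of positions, i.e., for $w \in \llbracket\top\rrbracket_V$ we have
$w \in \llbracket x \equiv r \pmod{0}\rrbracket_V$ if and only if $x(w) = r$ and
$w \in \llbracket\exists^{r \bmod 0} x\, \varphi\rrbracket_V$ if and only if $|I_w(x, \varphi)| = r$.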



B. Proof of Lemma 1

In order to prove Lemma 1 we need the following lemma which is a substitution principle for
fragments. It states that if some subformula is replaced by a $\le_{\mathcal{F}}$-smaller formula, then the
result is again $\le_{\mathcal{F}}$-smaller.

Lemma 2 Let $\mathcal{F}$ be a fragment. If $\varphi \le_{\mathcal{F}} \psi$, then $\nu(\varphi) \le_{\mathcal{F}} \nu(\psi)$ for every context $\nu$.

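Proof: The proof is by induction on the structure of $\nu$. Let $\varphi \le_{\mathcal{F}} \psi$ and suppose
$\mu(\nu(\psi)) \in \mathcal{F}$. We want to show $\mu(\nu(\varphi)) \in \mathcal{F}$. If $\nu = \circ$, then
$\mu(\nu(\psi)) = \mu(\psi)$. Hence, by definition, $\mu(\nu(\varphi)) = \mu(\varphi) \in \mathcal{F}$. If $\nu$ is another
atomic formula, then $\nu(\varphi) = \nu(\psi)$ and the claim becomes trivial. Suppose that $\nu = \neg\nu'$ or
$\nu = Q\, \nu'$ for some quantifier
$Q \in \{ \exists x, \forall x, \exists^{r \bmod q} x, \exists X, \forall X \mid x \in \mathbb{V}_1, X \in \mathbb{V}_2, r, q \in \mathbb{Z} \}$
and some context $\nu'$ and consider the primitive context $\mu' = \mu(\neg\circ)$ (respectively,
$\mu' = \mu(Q\, \circ)$). Inductively, $\nu'(\varphi) \le_{\mathcal{F}} \nu'(\psi)$. We have
$\mu(\nu(\psi)) = \mu'(\nu'(\psi)) \in \mathcal{F}$ and thus $\mu(\nu(\varphi)) = \mu'(\nu'(\varphi)) \in \mathcal{F}$. Finally suppose
$\nu = \nu_1' \vee \nu_2'$ for some contexts $\nu_i'$. Axiom (0) yields $\mu(\nu_i'(\psi)) \in \mathcal{F}$. By induction
$\nu_i'(\varphi) \le_{\mathcal{F}} \nu_i'(\psi)$ and therefore $\mu(\nu_i'(\varphi)) \in \mathcal{F}$. By the same axiom
$\mu(\nu(\varphi)) = \mu(\nu_1'(\varphi) \vee \nu_2'(\varphi)) \in \mathcal{F}$. In case $\nu = \nu_1' \wedge \nu_2'$ the claim
follows analogously. □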

Lemma [Tl If J- is a fragment, then ip <jr -0 if and only if pi <jr ij). 

Proof: The implication ip <jr ip ^ (p <jr ijj being trivial, it suffices to show the reverse 
implication. Suppose vO') G J- for a context v. Let fi be the primitive context obtained 
from 1/ by replacing all label predicates by T. Repeated application of Lemma [H shows 
1^(4') and hence /i(V') G F. With ip <jr ip we see fi{ip) G F. Again with Lemma [51 

we see I'iip) /^(<^) and hence ^{ip) G F. □ 
C. Proof of Proposition 1

In order to prove Proposition 1 we give a construction for the formula defining the residual.
We concentrate here on left residuals by letters; there are dual statements for right residuals.
At the end of this section we make those statements for right residuals explicit that are not
straightforward symmetric versions of the statements for left residuals. Residuals by words
are obtained by repeated residuals by letters.

Intermediately we need to handle formulae with free variables, even though the later
application will be to sentences. We therefore give a more general construction for the
residuals over the extended alphabet $\Lambda \times 2^V$ used to encode the value of the free variables.
The intuition of the construction is as follows: Given a model $w$, we want to evaluate $\varphi$
on the word $(a, J)w$. The idea is to handle the "phantom" $(a, J)$-position in front of $w$
syntactically using the set $J$ for bookkeeping purposes.

We tried to keep the construction flexible. Due to different premises, this leads to a
relatively high number of lemmas (four for the atomic formulae, one for Boolean connectives,
one for first-order and second-order quantification, respectively, and one for modular
quantification). However, they are all of a very similar structure. Plugging the lemmas of this
section into an easy induction yields for every formula $\varphi$ another formula $(a,J)^{-1}\varphi$ such
that

1. $(a,J)^{-1}\varphi \le_{\mathcal{F}} \varphi$ for all "appropriate" fragments $\mathcal{F}$ and

2. $\llbracket (a,J)^{-1}\varphi \rrbracket_{V'} = (a,J)^{-1} \llbracket\varphi\rrbracket_V$

for all sets of variables $V$ with $J \cup \mathrm{FV}(\varphi) \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$.

The first property is of syntactic nature; it notably yields that $(a,J)^{-1}\varphi$ is in $\mathcal{F}$ whenever
$\varphi$ is. Here, an "appropriate" fragment is a fragment which, depending on the predicates
occurring in $\varphi$, potentially has some additional closure properties. In the lemmas below, $\mathfrak{F}$
will denote a family of appropriate fragments. Note that we have one formula for all such
fragments (and not for every fragment a different formula), which is a stronger result than
actually needed for the closure under left residuals of a fixed fragment.

The second property gives semantic correctness, i.e., $(a,J)^{-1}\varphi$ actually defines the left
residual of $\varphi$. Note that for $(a,J)^{-1}\llbracket\varphi\rrbracket_V$ to make sense, $V$ has to contain all free
variables of $\varphi$ and all variables of $J$. Also note that since $\llbracket (a,J)^{-1}\varphi \rrbracket_{V'}$ is defined,
in particular no first-order variable in $J$ can appear freely in $(a,J)^{-1}\varphi$.

The following lemmas give formulae for the left residual of languages defined by one of the
atomic formulae. Lemma 3 deals with the formulae $\top$, $\bot$, empty, $\min(x)$, $\lambda(x) = b$, $x = y$,
$x < y$, $x \le y$ and $x \in X$ for which the closure properties of a fragment suffice. Lemma 4
and Lemma 5 are for the successor predicate $\mathrm{suc}(x,y)$ and for $\max(x)$, respectively. Our
construction for $\mathrm{suc}(x,y)$ relies on being able to replace $\mathrm{suc}(x,y)$ by $\min(y)$, thus restricting
the class of appropriate fragments. For $\max(x)$ the fragment must allow replacing $\max(x)$
by empty. Lemma 6 finally gives the construction for the modular predicate $x \equiv r \pmod{q}$
where we have to be able to change the remainder parameter $r$.

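Lemma 3 Let $\varphi$ be one of the atomic formulae $\top$, $\bot$, $\lambda(x) = b$, $x = y$, $x < y$, $x \le y$,
empty, $\min(x)$ or $x \in X$. Then for all $a \in \Lambda$ and all sets of variables $J$ there exists a
formula $(a,J)^{-1}\varphi$ which satisfies

\[ (a,J)^{-1}\varphi \le_{\mathcal{F}} \varphi \text{ for all fragments } \mathcal{F} \text{ and} \]
\[ \llbracket (a,J)^{-1}\varphi \rrbracket_{V'} = (a,J)^{-1} \llbracket\varphi\rrbracket_V \]

where $V$ is a set of variables with $J \cup \mathrm{FV}(\varphi) \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$.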

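Proof: In the following $w$ denotes some word over $\Lambda \times 2^{V'}$. For the formulae $\top$, $\bot$ and empty
we let $(a,J)^{-1}\top = \top$ and $(a,J)^{-1}\bot = \bot$ and $(a,J)^{-1}\mathrm{empty} = \bot$. Note that there exists no
$w$ such that $(a,J)w$ is empty, which establishes the correctness of $(a,J)^{-1}\mathrm{empty}$.

Suppose $x \in J$. Then $x((a,J)w) = 1$ and $\lambda_{(a,J)w}(x) = b$ if and only if $a = b$; for $x \notin J$ we
have $\lambda_{(a,J)w}(x) = \lambda_w(x)$. Hence the label predicate is given by

\[
(a,J)^{-1}(\lambda(x) = b) =
\begin{cases}
\lambda(x) = b & \text{if } x \notin J, \\
\top & \text{if } x \in J \text{ and } a = b, \\
\bot & \text{else.}
\end{cases}
\]

We next consider the equality predicate and the two order predicates. We have
$x((a,J)w) = y((a,J)w)$ if and only if either $x, y \in J$ or $x, y \notin J$ and $x(w) = y(w)$. We have
$x((a,J)w) < y((a,J)w)$ if and only if either $x \in J$ and $y \notin J$ or $x, y \notin J$ and $x(w) < y(w)$. For the
non-strict order we have $x((a,J)w) \le y((a,J)w)$ if and only if either $x \in J$ or $x, y \notin J$ and
$x(w) \le y(w)$. We therefore let

\[
(a,J)^{-1}(x = y) =
\begin{cases}
x = y & \text{if } x \notin J \text{ and } y \notin J, \\
\top & \text{if } x \in J \text{ and } y \in J, \\
\bot & \text{else,}
\end{cases}
\]

\[
(a,J)^{-1}(x < y) =
\begin{cases}
x < y & \text{if } x \notin J \text{ and } y \notin J, \\
\top & \text{if } x \in J \text{ and } y \notin J, \\
\bot & \text{else,}
\end{cases}
\]

\[
(a,J)^{-1}(x \le y) =
\begin{cases}
x \le y & \text{if } x \notin J \text{ and } y \notin J, \\
\top & \text{if } x \in J, \\
\bot & \text{else.}
\end{cases}
\]

The formula $\min(x)$ is true over $(a,J)w$, i.e., $x((a,J)w) = 1$, if and only if $x \in J$. Thus

\[
(a,J)^{-1}(\min(x)) =
\begin{cases}
\top & \text{if } x \in J, \\
\bot & \text{else.}
\end{cases}
\]

Finally consider the second-order predicate $x \in X$. Suppose $x \in J$. Then
$x((a,J)w) \in X((a,J)w)$ if and only if $X \in J$. For $x \notin J$ we have $x((a,J)w) \in X((a,J)w)$ if and
only if $x(w) \in X(w)$. Therefore let

\[
(a,J)^{-1}(x \in X) =
\begin{cases}
x \in X & \text{if } x \notin J, \\
\top & \text{if } x \in J \text{ and } X \in J, \\
\bot & \text{else.}
\end{cases}
\]

It is easy to see that all these formulae satisfy the syntactic property
$(a,J)^{-1}\varphi \le_{\mathcal{F}} \varphi$ for all fragments $\mathcal{F}$. □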
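
The case distinctions of Lemma 3 can be checked mechanically on small words. The following is a minimal
sketch under several assumptions that are not part of the paper: formulae are encoded as Python tuples,
words are plain strings over a finite alphabet, only the first-order atomic formulae are covered (the
predicate $x \in X$ is omitted), and the names `holds`, `left_residual` and `check` are hypothetical. It
compares the truth value of $(a,J)^{-1}\varphi$ on $w$ with the truth value of $\varphi$ on $(a,J)w$.

```python
from itertools import combinations, product

TRUE, FALSE = ("true",), ("false",)


def holds(phi, word, pos):
    """Evaluate an atomic formula on `word`; `pos` maps variables to 1-based positions."""
    tag = phi[0]
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "empty":
        return len(word) == 0
    if tag == "label":
        return word[pos[phi[1]] - 1] == phi[2]
    if tag == "min":
        return pos[phi[1]] == 1
    x, y = pos[phi[1]], pos[phi[2]]
    return {"eq": x == y, "lt": x < y, "le": x <= y}[tag]


def left_residual(a, J, phi):
    """Left residual of an atomic formula by the extended letter (a, J), as in Lemma 3."""
    tag = phi[0]
    if tag in ("true", "false"):
        return phi
    if tag == "empty":
        return FALSE
    if tag == "label":
        if phi[1] not in J:
            return phi
        return TRUE if a == phi[2] else FALSE
    if tag == "min":
        return TRUE if phi[1] in J else FALSE
    x, y = phi[1], phi[2]
    if x not in J and y not in J:
        return phi
    if tag == "eq":
        return TRUE if (x in J and y in J) else FALSE
    if tag == "lt":
        return TRUE if (x in J and y not in J) else FALSE
    return TRUE if x in J else FALSE  # tag == "le"


def check(phi, variables=("x", "y"), alphabet="ab", max_len=3):
    """Brute-force comparison of the residual formula with the residual language."""
    for a in alphabet:
        for size in range(len(variables) + 1):
            for J in map(set, combinations(variables, size)):
                residual = left_residual(a, J, phi)
                free = [v for v in variables if v not in J]
                for n in range(max_len + 1):
                    for w in map("".join, product(alphabet, repeat=n)):
                        for places in product(range(1, n + 1), repeat=len(free)):
                            sigma = dict(zip(free, places))
                            # Variables in J sit on the new first position; all others shift by one.
                            shifted = {v: 1 for v in J}
                            shifted.update({v: p + 1 for v, p in sigma.items()})
                            if holds(residual, w, sigma) != holds(phi, a + w, shifted):
                                return False
    return True
```

Such a check only covers words up to a fixed length and therefore does not replace the proof; it is meant
as a sanity test of the case tables.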
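Lemma 4 Consider the atomic formula $\mathrm{suc}(x,y)$ for some $x, y \in \mathbb{V}_1$. Let $\mathfrak{F}$ be a family of
fragments $\mathcal{F}$ such that $\min(y) \le_{\mathcal{F}} \mathrm{suc}(x,y)$. Then for all $a \in \Lambda$ and all sets of variables $J$
there exists a formula $(a,J)^{-1}\mathrm{suc}(x,y)$ which satisfies

\[ (a,J)^{-1}\mathrm{suc}(x,y) \le_{\mathcal{F}} \mathrm{suc}(x,y) \text{ for all } \mathcal{F} \in \mathfrak{F} \text{ and} \]
\[ \llbracket (a,J)^{-1}\mathrm{suc}(x,y) \rrbracket_{V'} = (a,J)^{-1} \llbracket \mathrm{suc}(x,y) \rrbracket_V \]

where $V$ is a set of variables with $J \cup \{x, y\} \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$.

Proof: Let $w$ be a word over $\Lambda \times 2^{V'}$ and suppose $x \in J$. Then $x((a,J)w) + 1 = y((a,J)w)$
if and only if $y((a,J)w) = 2$ if and only if $y \notin J$ and $y(w) = 1$. Suppose $x \notin J$. Then
$x((a,J)w) + 1 = y((a,J)w)$ if and only if $y \notin J$ and $x(w) + 1 = y(w)$. Thus

\[
(a,J)^{-1}\mathrm{suc}(x,y) =
\begin{cases}
\mathrm{suc}(x,y) & \text{if } x \notin J \text{ and } y \notin J, \\
\min(y) & \text{if } x \in J \text{ and } y \notin J, \\
\bot & \text{else.}
\end{cases}
\]

This formula satisfies the syntactic property $(a,J)^{-1}\mathrm{suc}(x,y) \le_{\mathcal{F}} \mathrm{suc}(x,y)$ for all
$\mathcal{F} \in \mathfrak{F}$. □

Lemma 5 Consider the atomic formula $\max(x)$ for some $x \in \mathbb{V}_1$. Let $\mathfrak{F}$ be a family of
fragments $\mathcal{F}$ such that $\mathrm{empty} \le_{\mathcal{F}} \max(x)$. Then for all $a \in \Lambda$ and all sets of variables $J$
there exists a formula $(a,J)^{-1}\max(x)$ which satisfies

\[ (a,J)^{-1}\max(x) \le_{\mathcal{F}} \max(x) \text{ for all } \mathcal{F} \in \mathfrak{F} \text{ and} \]
\[ \llbracket (a,J)^{-1}\max(x) \rrbracket_{V'} = (a,J)^{-1} \llbracket \max(x) \rrbracket_V \]

where $V$ is a set of variables with $J \cup \{x\} \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$.

Proof: Let $w$ be a word over $\Lambda \times 2^{V'}$ and suppose $x \in J$. Then we have
$x((a,J)w) = 1 = |(a,J)w|$ if and only if $|w| = 0$. For $x \notin J$ we have $x((a,J)w) = x(w) + 1$. Thus

\[
(a,J)^{-1}\max(x) =
\begin{cases}
\max(x) & \text{if } x \notin J, \\
\mathrm{empty} & \text{else.}
\end{cases}
\]

This formula satisfies the syntactic property $(a,J)^{-1}\max(x) \le_{\mathcal{F}} \max(x)$ for all
$\mathcal{F} \in \mathfrak{F}$. □

Lemma 6 Consider the atomic formula $x \equiv r \pmod{q}$ for some $x \in \mathbb{V}_1$ and some $q, r \in \mathbb{Z}$.
Let $\mathfrak{F}$ be a family of fragments $\mathcal{F}$ such that $x \equiv r \pmod{q}$ and $x \equiv s \pmod{q}$ are
$\mathcal{F}$-equivalent for all $s \in \mathbb{Z}$. Then for all $a \in \Lambda$ and all sets of variables $J$ there exists a
formula $(a,J)^{-1}(x \equiv r \pmod{q})$ satisfying

\[ (a,J)^{-1}(x \equiv r \pmod{q}) \le_{\mathcal{F}} (x \equiv r \pmod{q}) \text{ for all } \mathcal{F} \in \mathfrak{F} \text{ and} \]
\[ \llbracket (a,J)^{-1}(x \equiv r \pmod{q}) \rrbracket_{V'} = (a,J)^{-1} \llbracket x \equiv r \pmod{q} \rrbracket_V \]

where $V$ is a set of variables with $J \cup \{x\} \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$.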
Proof: Let $w$ be a word over $\Lambda \times 2^{V'}$ and suppose $x \in J$. Then $x((a,J)w) = 1$. Suppose
$x \notin J$. Then $x((a,J)w) = 1 + x(w)$ and hence $x((a,J)w) \equiv r \pmod{q}$ if and only if
$x(w) \equiv r - 1 \pmod{q}$. Thus

\[
(a,J)^{-1}(x \equiv r \pmod{q}) =
\begin{cases}
x \equiv r - 1 \pmod{q} & \text{if } x \notin J, \\
\top & \text{if } x \in J \text{ and } r \equiv 1 \pmod{q}, \\
\bot & \text{else.}
\end{cases}
\]

This formula satisfies the syntactic property. □

The next lemmas "lift" the construction to formulae composed by Boolean combinations
and quantifiers. These lemmas state that if there are formulae defining the left residual
which work for all fragments in $\mathfrak{F}$, then there exist formulae defining the left residual of the
Boolean combinations (Lemma 7), the first- and the second-order quantification (Lemma 8
and Lemma 9, respectively) which also work for all fragments in $\mathfrak{F}$. Moreover, there also
exists a formula defining the left residual for a formula involving the modular counting
quantifier which works for all fragments in $\mathfrak{F}$ with some additional closure property (Lemma 10).

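Lemma 7 Let $\varphi$ be one of the formulae $\neg\varphi_1$ or $\varphi_1 \vee \varphi_2$ or $\varphi_1 \wedge \varphi_2$. Let $\mathfrak{F}$ be a
family of fragments. Suppose for all $a \in \Lambda$ and all sets of variables $J$ there exist formulae
$(a,J)^{-1}\varphi_i$, $i \in \{1, 2\}$, which satisfy

\[ (a,J)^{-1}\varphi_i \le_{\mathcal{F}} \varphi_i \text{ for all } \mathcal{F} \in \mathfrak{F} \text{ and} \]
\[ \llbracket (a,J)^{-1}\varphi_i \rrbracket_{V'} = (a,J)^{-1} \llbracket\varphi_i\rrbracket_V \]

where $V$ is a set of variables with $J \cup \mathrm{FV}(\varphi_i) \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$. Then for all
$a \in \Lambda$ and all sets of variables $J$ there exists a formula $(a,J)^{-1}\varphi$ which satisfies

\[ (a,J)^{-1}\varphi \le_{\mathcal{F}} \varphi \text{ for all } \mathcal{F} \in \mathfrak{F} \text{ and} \]
\[ \llbracket (a,J)^{-1}\varphi \rrbracket_{V'} = (a,J)^{-1} \llbracket\varphi\rrbracket_V \]

where $V$ is a set of variables with $J \cup \mathrm{FV}(\varphi) \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$.

Proof: The constructions for positive Boolean connectives are:

\[ (a,J)^{-1}(\varphi_1 \vee \varphi_2) = ((a,J)^{-1}\varphi_1) \vee ((a,J)^{-1}\varphi_2), \]
\[ (a,J)^{-1}(\varphi_1 \wedge \varphi_2) = ((a,J)^{-1}\varphi_1) \wedge ((a,J)^{-1}\varphi_2). \]

Let $\mu$ be a primitive context, let $\mathcal{F} \in \mathfrak{F}$ and suppose $\mu(\varphi_1 \vee \varphi_2) \in \mathcal{F}$. Since $\mathcal{F}$ is a
fragment we see $\mu(\varphi_i) \in \mathcal{F}$ (for $i \in \{1, 2\}$). By assumption $\mu((a,J)^{-1}\varphi_i) \in \mathcal{F}$ and finally
$\mu((a,J)^{-1}\varphi_1 \vee (a,J)^{-1}\varphi_2) \in \mathcal{F}$. This shows
$(a,J)^{-1}(\varphi_1 \vee \varphi_2) \le_{\mathcal{F}} (\varphi_1 \vee \varphi_2)$. Analogously
$(a,J)^{-1}(\varphi_1 \wedge \varphi_2) \le_{\mathcal{F}} (\varphi_1 \wedge \varphi_2)$. For the negation let

\[ (a,J)^{-1}(\neg\varphi_1) = \neg((a,J)^{-1}\varphi_1). \]

Suppose $\mu(\neg\varphi_1) \in \mathcal{F}$ for some primitive context $\mu$. Thus $\mu'(\varphi_1) \in \mathcal{F}$ for the primitive
context $\mu' = \mu(\neg\circ)$. Now, by assumption $\mu'((a,J)^{-1}\varphi_1) = \mu(\neg(a,J)^{-1}\varphi_1) \in \mathcal{F}$. This
shows $(a,J)^{-1}(\neg\varphi_1) \le_{\mathcal{F}} \neg\varphi_1$. The semantic correctness is easily verified for all Boolean
connectives. □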
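Lemma 8 Consider the formulae $\exists x\, \varphi$ and $\forall x\, \varphi$ for some $x \in \mathbb{V}_1$. Let $\mathfrak{F}$ be a family of
fragments. Suppose for all $a \in \Lambda$ and all sets of variables $J$ there exists a formula $(a,J)^{-1}\varphi$
which satisfies

\[ (a,J)^{-1}\varphi \le_{\mathcal{F}} \varphi \text{ for all } \mathcal{F} \in \mathfrak{F} \text{ and} \]
\[ \llbracket (a,J)^{-1}\varphi \rrbracket_{V'} = (a,J)^{-1} \llbracket\varphi\rrbracket_V \]

where $V$ is a set of variables with $J \cup \mathrm{FV}(\varphi) \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$. Then for all
$a \in \Lambda$ and all sets of variables $J$ there exist formulae $(a,J)^{-1}\exists x\, \varphi$ and
$(a,J)^{-1}\forall x\, \varphi$ which satisfy

\[ (a,J)^{-1}\exists x\, \varphi \le_{\mathcal{F}} \exists x\, \varphi \text{ for all } \mathcal{F} \in \mathfrak{F}, \qquad \llbracket (a,J)^{-1}\exists x\, \varphi \rrbracket_{V'} = (a,J)^{-1} \llbracket \exists x\, \varphi \rrbracket_V, \]
\[ (a,J)^{-1}\forall x\, \varphi \le_{\mathcal{F}} \forall x\, \varphi \text{ for all } \mathcal{F} \in \mathfrak{F}, \qquad \llbracket (a,J)^{-1}\forall x\, \varphi \rrbracket_{V'} = (a,J)^{-1} \llbracket \forall x\, \varphi \rrbracket_V \]

where $V$ is a set of variables with $J \cup \mathrm{FV}(\exists x\, \varphi) = J \cup \mathrm{FV}(\forall x\, \varphi) \subseteq V$ and
$V' = V \setminus (J \cap \mathbb{V}_1)$.

Proof: Let $w$ be a word over $\Lambda \times 2^{V'}$ and let $w' = (a,J)w$. We only argue for the existential
quantifier; the universal quantifier is analogous. Suppose $w'[x/i] \in \llbracket\varphi\rrbracket_{V \cup \{x\}}$ for some
$i$. Suppose $i = 1$ and $w'[x/1] = (a, J \cup \{x\})w''$. Note that $w''$ is essentially $w$ but $x$ is removed from
all second components. Then $w'[x/1] \in \llbracket\varphi\rrbracket_{V \cup \{x\}}$ is equivalent to
$w'' \in \llbracket (a, J \cup \{x\})^{-1}\varphi \rrbracket_{V' \setminus \{x\}}$ by assumption. This in turn is equivalent to
$w \in \llbracket (a, J \cup \{x\})^{-1}\varphi \rrbracket_{V'}$ since, in particular, $x$ is not a free variable of
$(a, J \cup \{x\})^{-1}\varphi$ and consequently its truth value does not depend on the value of $x$.

Let $i \ge 2$ and let $w'[x/i] = (a, J \setminus \{x\})w''$. Notice $w'' = w[x/i-1]$. By assumption
$w'[x/i] \in \llbracket\varphi\rrbracket_{V \cup \{x\}}$ is equivalent to
$w'' \in \llbracket (a, J \setminus \{x\})^{-1}\varphi \rrbracket_{V' \cup \{x\}}$. These considerations
allow us to set (with $\varphi_1 = (a, J \cup \{x\})^{-1}\varphi$ and $\varphi_2 = (a, J \setminus \{x\})^{-1}\varphi$)

\[ (a,J)^{-1}\exists x\, \varphi = \varphi_1 \vee \exists x\, \varphi_2, \]
\[ (a,J)^{-1}\forall x\, \varphi = \varphi_1 \wedge \forall x\, \varphi_2. \]

We now show the syntactic property. Let $\mathcal{F} \in \mathfrak{F}$. We have
$\varphi_1 \le_{\mathcal{F}} \exists x\, \varphi_1 \le_{\mathcal{F}} \exists x\, \varphi$ by assumption on $\varphi$; note that
$x \notin \mathrm{FV}(\varphi_1)$. Together with $\exists x\, \varphi_2 \le_{\mathcal{F}} \exists x\, \varphi$ this yields
$(a,J)^{-1}\exists x\, \varphi \le_{\mathcal{F}} \exists x\, \varphi$. The argument for the universal quantifier is analogous. □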

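Lemma 9 Consider the formulae $\exists X\, \varphi$ and $\forall X\, \varphi$ for some $X \in \mathbb{V}_2$. Let $\mathfrak{F}$ be a family of
fragments. Suppose for all $a \in \Lambda$ and all sets of variables $J$ there exists a formula $(a,J)^{-1}\varphi$
which satisfies

\[ (a,J)^{-1}\varphi \le_{\mathcal{F}} \varphi \text{ for all } \mathcal{F} \in \mathfrak{F} \text{ and} \]
\[ \llbracket (a,J)^{-1}\varphi \rrbracket_{V'} = (a,J)^{-1} \llbracket\varphi\rrbracket_V \]

where $V$ is a set of variables with $J \cup \mathrm{FV}(\varphi) \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$. Then for all
$a \in \Lambda$ and all sets of variables $J$ there exist formulae $(a,J)^{-1}\exists X\, \varphi$ and
$(a,J)^{-1}\forall X\, \varphi$ which satisfy

\[ (a,J)^{-1}\exists X\, \varphi \le_{\mathcal{F}} \exists X\, \varphi \text{ for all } \mathcal{F} \in \mathfrak{F}, \qquad \llbracket (a,J)^{-1}\exists X\, \varphi \rrbracket_{V'} = (a,J)^{-1} \llbracket \exists X\, \varphi \rrbracket_V, \]
\[ (a,J)^{-1}\forall X\, \varphi \le_{\mathcal{F}} \forall X\, \varphi \text{ for all } \mathcal{F} \in \mathfrak{F}, \qquad \llbracket (a,J)^{-1}\forall X\, \varphi \rrbracket_{V'} = (a,J)^{-1} \llbracket \forall X\, \varphi \rrbracket_V \]

where $V$ is a set of variables with $J \cup \mathrm{FV}(\exists X\, \varphi) = J \cup \mathrm{FV}(\forall X\, \varphi) \subseteq V$ and
$V' = V \setminus (J \cap \mathbb{V}_1)$.

Proof: Let $w = (a_1, J_1) \cdots (a_n, J_n)$ and let $w' = (a,J)w$ where $a_i \in \Lambda$ and $J_i \subseteq V'$.
We only argue for the existential quantifier; the universal quantifier is analogous. Suppose
$(a, J')(a_1, J_1') \cdots (a_n, J_n') \in \llbracket\varphi\rrbracket_{V \cup \{X\}}$ for some $J'$ and $J_i'$ with
$J' \mathbin{\Delta} J \subseteq \{X\}$ and $J_i' \mathbin{\Delta} J_i \subseteq \{X\}$. Now, either $X \in J'$ or $X \notin J'$. The premise
for $\varphi$ yields $(a_1, J_1') \cdots (a_n, J_n') \in \llbracket (a, J \cup \{X\})^{-1}\varphi \rrbracket_{V' \cup \{X\}}$ in the first case
(and $(a_1, J_1') \cdots (a_n, J_n') \in \llbracket (a, J \setminus \{X\})^{-1}\varphi \rrbracket_{V' \cup \{X\}}$ in the second case,
respectively). Therefore we define

\[ (a,J)^{-1}\exists X\, \varphi = \exists X \bigl( (a, J \cup \{X\})^{-1}\varphi \vee (a, J \setminus \{X\})^{-1}\varphi \bigr), \]
\[ (a,J)^{-1}\forall X\, \varphi = \forall X \bigl( (a, J \cup \{X\})^{-1}\varphi \wedge (a, J \setminus \{X\})^{-1}\varphi \bigr). \]

Next, we show the syntactic property. Let $\mathcal{F}$ be a fragment in $\mathfrak{F}$. The assumption on
$\varphi$ yields $(a, J \cup \{X\})^{-1}\varphi \le_{\mathcal{F}} \varphi$ and $(a, J \setminus \{X\})^{-1}\varphi \le_{\mathcal{F}} \varphi$. Thus
$(a, J \cup \{X\})^{-1}\varphi \vee (a, J \setminus \{X\})^{-1}\varphi \le_{\mathcal{F}} \varphi$ and finally
$(a,J)^{-1}\exists X\, \varphi \le_{\mathcal{F}} \exists X\, \varphi$. The argument for the universal
quantifier is analogous. □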

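The following lemma lifts the residual construction to the modular counting quantifier. It
applies in particular to mod-stable fragments but is formulated slightly more generally.

Lemma 10 Consider the formula $\exists^{r \bmod q} x\, \varphi$ for some $x \in \mathbb{V}_1$ and some $r, q \in \mathbb{Z}$. Let
$\mathfrak{F}$ be a family of fragments $\mathcal{F}$ such that the formulae $\exists^{r \bmod q} x\, \varphi$ and
$\exists^{s \bmod q} x\, \varphi$ are $\mathcal{F}$-equivalent for all $s \in \mathbb{Z}$ and such that
$\psi \le_{\mathcal{F}} \exists^{r \bmod q} x\, \varphi$ and $\neg\psi \le_{\mathcal{F}} \exists^{r \bmod q} x\, \varphi$ for all
$\psi \le_{\mathcal{F}} \varphi$ with $x \notin \mathrm{FV}(\psi)$. Suppose for all $a \in \Lambda$ and all sets of variables $J$ there
exists a formula $(a,J)^{-1}\varphi$ which satisfies

\[ (a,J)^{-1}\varphi \le_{\mathcal{F}} \varphi \text{ for all } \mathcal{F} \in \mathfrak{F} \text{ and} \]
\[ \llbracket (a,J)^{-1}\varphi \rrbracket_{V'} = (a,J)^{-1} \llbracket\varphi\rrbracket_V \]

where $V$ is a set of variables with $J \cup \mathrm{FV}(\varphi) \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$. Then for all
$a \in \Lambda$ and all sets of variables $J$ there exists a formula $(a,J)^{-1}(\exists^{r \bmod q} x\, \varphi)$ which
satisfies

\[ (a,J)^{-1}(\exists^{r \bmod q} x\, \varphi) \le_{\mathcal{F}} \exists^{r \bmod q} x\, \varphi \text{ for all } \mathcal{F} \in \mathfrak{F} \text{ and} \]
\[ \llbracket (a,J)^{-1}(\exists^{r \bmod q} x\, \varphi) \rrbracket_{V'} = (a,J)^{-1} \llbracket \exists^{r \bmod q} x\, \varphi \rrbracket_V \]

where $V$ is a set of variables with $J \cup \mathrm{FV}(\exists^{r \bmod q} x\, \varphi) \subseteq V$ and
$V' = V \setminus (J \cap \mathbb{V}_1)$.

Proof: Let $\varphi_1 = (a, J \cup \{x\})^{-1}\varphi$ and $\varphi_2 = (a, J \setminus \{x\})^{-1}\varphi$ be the formulae from the
premise. Let

\[ (a,J)^{-1}\exists^{r \bmod q} x\, \varphi = (\varphi_1 \wedge \exists^{r-1 \bmod q} x\, \varphi_2) \vee (\neg\varphi_1 \wedge \exists^{r \bmod q} x\, \varphi_2). \]

Suppose we are given a model $w$. The formula realizes a straightforward case distinction:
Either the first position of $(a,J)w$ is a $\varphi$-position and then the number of $\varphi$-positions in
the factor $w$ has to be $r - 1$ (modulo $q$), or it is not a $\varphi$-position and then the number
of $\varphi$-positions in $w$ is $r$ (modulo $q$). Here, for conciseness we say that a position $i$ of
$(a,J)w$ is a $\varphi$-position if $((a,J)w)[x/i] \in \llbracket\varphi\rrbracket_{V \cup \{x\}}$, i.e., $\varphi$ is true if $x$ is
interpreted by the position $i$. Given the closure properties of a fragment $\mathcal{F} \in \mathfrak{F}$, it is easy to
see that $(a,J)^{-1}\exists^{r \bmod q} x\, \varphi \le_{\mathcal{F}} \exists^{r \bmod q} x\, \varphi$ is inherited from $\varphi_1$
and $\varphi_2$. Notice $x \notin \mathrm{FV}(\varphi_1)$. □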
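
The case distinction behind the modular counting quantifier can be illustrated numerically. The
following is a minimal sketch under simplifying assumptions that are not taken from the paper: words are
plain strings, the property "being a $\varphi$-position" depends only on the letter at that position, and
the helper names `congruent`, `holds_mod`, `holds_residual` and `check` are hypothetical. The convention
for $q = 0$ follows the semantics in Appendix A, where the quantifier counts exactly.

```python
from itertools import product


def congruent(n, r, q):
    """n = r (mod q), where q = 0 degenerates to equality."""
    return n == r if q == 0 else (n - r) % abs(q) == 0


def holds_mod(word, is_phi_letter, r, q):
    """The number of phi-positions of `word` is congruent to r modulo q."""
    count = sum(1 for letter in word if is_phi_letter(letter))
    return congruent(count, r, q)


def holds_residual(a, word, is_phi_letter, r, q):
    """Case distinction of the residual: is the phantom first position a phi-position?"""
    if is_phi_letter(a):
        return holds_mod(word, is_phi_letter, r - 1, q)
    return holds_mod(word, is_phi_letter, r, q)


def check(alphabet="ab", max_len=4, moduli=(0, 1, 2, 3)):
    """Compare both sides on all short words, for phi-positions = positions labeled 'a'."""
    is_a = lambda letter: letter == "a"
    for q, n in product(moduli, range(max_len + 1)):
        for r in range(-1, 4):
            for letters in product(alphabet, repeat=n):
                w = "".join(letters)
                for a in alphabet:
                    if holds_mod(a + w, is_a, r, q) != holds_residual(a, w, is_a, r, q):
                        return False
    return True
```

In the general construction the two cases are expressed syntactically by $\varphi_1$ and $\neg\varphi_1$; the
sketch replaces them by a direct test on the letter $a$.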
Right Residuals. There are of course dual statements providing us with formulae $\varphi(a,J)^{-1}$
defining the right residual. We shall only make those explicit where some attention has to
be paid to the premises.

Lemma 11 Consider the atomic formula $\mathrm{suc}(x,y)$ for some $x, y \in \mathbb{V}_1$. Let $\mathfrak{F}$ be a family
of fragments $\mathcal{F}$ such that $\max(x) \le_{\mathcal{F}} \mathrm{suc}(x,y)$. Then for all $a \in \Lambda$ and all sets of variables
$J$ there exists a formula $\mathrm{suc}(x,y)(a,J)^{-1}$ which satisfies

\[ \mathrm{suc}(x,y)(a,J)^{-1} \le_{\mathcal{F}} \mathrm{suc}(x,y) \text{ for all } \mathcal{F} \in \mathfrak{F} \text{ and} \]
\[ \llbracket \mathrm{suc}(x,y)(a,J)^{-1} \rrbracket_{V'} = \llbracket \mathrm{suc}(x,y) \rrbracket_V (a,J)^{-1} \]

where $V$ is a set of variables with $J \cup \{x, y\} \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$.

Proof: Let $w$ be a word over $\Lambda \times 2^{V'}$ and suppose $y \in J$. Then
$x(w(a,J)) + 1 = y(w(a,J)) = |w(a,J)|$ if and only if $x(w) = |w|$. Suppose now $y \notin J$, then
$x(w(a,J)) + 1 = y(w(a,J))$ if and only if $x \notin J$ and $x(w) + 1 = y(w)$. Let thus

\[
\mathrm{suc}(x,y)(a,J)^{-1} =
\begin{cases}
\mathrm{suc}(x,y) & \text{if } x \notin J \text{ and } y \notin J, \\
\max(x) & \text{if } x \notin J \text{ and } y \in J, \\
\bot & \text{else.}
\end{cases}
\]

This formula satisfies the syntactic property. □

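Lemma 12 Consider the atomic formula $\min(x)$ for some $x \in \mathbb{V}_1$. Let $\mathfrak{F}$ be a family of
fragments $\mathcal{F}$ such that $\mathrm{empty} \le_{\mathcal{F}} \min(x)$. Then for all $a \in \Lambda$ and all sets of variables $J$
there exists a formula $\min(x)(a,J)^{-1}$ which satisfies

\[ \min(x)(a,J)^{-1} \le_{\mathcal{F}} \min(x) \text{ for all } \mathcal{F} \in \mathfrak{F} \text{ and} \]
\[ \llbracket \min(x)(a,J)^{-1} \rrbracket_{V'} = \llbracket \min(x) \rrbracket_V (a,J)^{-1} \]

where $V$ is a set of variables with $J \cup \{x\} \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$.

Proof: Let $w$ be a word over $\Lambda \times 2^{V'}$ and suppose $x \in J$. Then $x(w(a,J)) = |w(a,J)|$ and
consequently $x(w(a,J)) = 1$ if and only if $|w| = 0$. If $x \notin J$, then $x(w(a,J)) = x(w)$. Let
therefore

\[
\min(x)(a,J)^{-1} =
\begin{cases}
\min(x) & \text{if } x \notin J, \\
\mathrm{empty} & \text{else.}
\end{cases}
\]

This formula satisfies the syntactic property. □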

We are now ready to show Proposition 1.

Proposition 1 Let $\mathcal{F}$ be a fragment and suppose that $\mathcal{F}$ is suc-stable and mod-stable. Then
the class of languages defined by $\mathcal{F}$ is closed under residuals.

Proof: We show closure under left residuals. By induction on the structure of $\varphi$ we see
that for all $a \in \Lambda$ and all sets of variables $J$ there exists a formula $(a,J)^{-1}\varphi$ which
satisfies $(a,J)^{-1}\varphi \le_{\mathcal{F}} \varphi$ and $\llbracket (a,J)^{-1}\varphi \rrbracket_{V'} = (a,J)^{-1} \llbracket\varphi\rrbracket_V$ where $V$
is a set of variables with $J \cup \mathrm{FV}(\varphi) \subseteq V$ and $V' = V \setminus (J \cap \mathbb{V}_1)$. For the atomic formulae
$\top$, $\bot$, $\lambda(x) = a$, $x = y$, $x < y$, $x \le y$, empty, $\min(x)$ and $x \in X$ this is Lemma 3. Let
$\mathfrak{F} = \{\mathcal{F}\}$. The predicates $\mathrm{suc}(x,y)$, $\max(x)$ and $x \equiv r \pmod{q}$ are Lemma 4, Lemma 5 and
Lemma 6, respectively. Note that in all cases $\mathcal{F}$ meets the requirements of the respective lemmas. For
Boolean connectives, first-order quantification, second-order quantification and modular quantification,
the claim follows by induction and Lemma 7, Lemma 8, Lemma 9 and Lemma 10, respectively. Now,
using the claim and setting $a^{-1}\varphi = (a, \emptyset)^{-1}\varphi$ yields $a^{-1}L_A(\varphi) = L_A(a^{-1}\varphi)$ for every
finite alphabet $A \subseteq \Lambda$ and $a^{-1}\varphi \le_{\mathcal{F}} \varphi$. In particular, if $\varphi \in \mathcal{F}$, then
$a^{-1}\varphi \in \mathcal{F}$, that is, if $L \in \mathcal{L}_A(\mathcal{F})$, then also $a^{-1}L \in \mathcal{L}_A(\mathcal{F})$. Closure of
$\mathcal{L}_A(\mathcal{F})$ under right residuals follows symmetrically. □



D. Proof of Proposition 2

In order to prove Proposition 2 we give a construction for the formula defining the inverse
morphic image. More specifically, let $A, B \subseteq \Lambda$ be finite alphabets. For a morphism
$h : B^* \to A^*$ and a formula $\varphi$ we construct a formula $h^{-1}(\varphi)$ which, interpreted over $w$, has
the same truth value as $\varphi$ interpreted over $h(w)$. In addition, $h^{-1}(\varphi)$ meets the syntactic
property of being not "more complicated" than $\varphi$ in a certain sense.

As for residuals the application will mainly be to sentences, but we intermediately need to
handle formulae with free variables. We need some more notation to formulate this concisely.
Let $h : B^* \to A^*$ be a morphism, let $w = b_1 \cdots b_m$ for $b_i \in B$ and suppose $h(w) = a_1 \cdots a_n$
for $a_i \in A$. Then for every $i \in \{1, \ldots, n\}$ there exist unique numbers $j \in \{1, \ldots, m\}$ and
$d \in \{1, \ldots, |h(b_j)|\}$ such that $|a_1 \cdots a_i| = |h(b_1 \cdots b_{j-1})| + d$. The numbers $(j, d)$ are called
$h$-coordinates on $w$ of the position $i$ of $h(w)$; the number $j$ basically identifies the position
of $w$ where $i$ originates from and $d$ is the offset within the image of $b_j$. See also Figure 1 for
an illustration. Note that if $B$ is finite, then $\max_{b \in B} |h(b)|$ is a well-defined upper bound
for $d$.



Figure 1: The $h$-coordinates of $h(w)$. In this example, $h(b_1) = a_1 a_2$, $h(b_2) = h(b_3) = 1$,
$h(b_4) = a_3 a_4 a_5$ and $h(b_5) = a_6$; position 5, for example, has $h$-coordinates $(4, 3)$.
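
The $h$-coordinates of a position can be computed by scanning $w$ from left to right. The following is a
minimal sketch; the function name `h_coordinates` and the encoding of a morphism as a dictionary from
letters to lists of letters are assumptions made for this illustration only.

```python
def h_coordinates(h, word, i):
    """Return the h-coordinates (j, d) of the 1-based position i of h(word).

    h:    dict mapping each letter b to the list of letters of h(b) (possibly empty).
    word: sequence of letters b1, ..., bm.
    """
    consumed = 0  # length of h(b1 ... b(j-1))
    for j, letter in enumerate(word, start=1):
        length = len(h[letter])
        if i <= consumed + length:
            return j, i - consumed
        consumed += length
    raise IndexError("position %d is outside h(word)" % i)


# The morphism of Figure 1, restricted to the first five letters of w.
h = {"b1": ["a1", "a2"], "b2": [], "b3": [], "b4": ["a3", "a4", "a5"], "b5": ["a6"]}
w = ["b1", "b2", "b3", "b4", "b5"]
coordinates_of_5 = h_coordinates(h, w, 5)  # position 5 lies in the image of b4
```

Letters with an empty image are skipped automatically, since no position of $h(w)$ originates from them.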

The idea is to encode the variables of $h(w)$ in the alphabet of $w$. Let $i$ be a position of $h(w)$
with $h$-coordinates $(j, d)$. A first-order variable is encoded by the $h$-coordinates $(j, d)$. A
second-order variable $X$ is distributed over several second-order variables $X_i$ in such a way
that $X$ contains the position $i$ of $h(w)$ if $X_d$ contains the position $j$ of $w$. To formalize this
we first introduce for every set of variables $V$ a derived set of variables $V_n$ with the same
first-order variables as $V$ such that for every second-order variable $X$ there are $n$ distinct
variables $X_1, \ldots, X_n$. If now $\delta : \mathbb{V}_1 \to_p \mathbb{N}$ is a partial function mapping a first-order variable
to its offset, then the above encoding is realized by the morphism
$h_\delta : (B \times 2^{V_n})^* \to (A \times 2^V)^*$ given by the following definition.

Definition 4 Let $V$ be a set of variables and let $n \in \mathbb{N}$. Then $V_n$ is some fixed minimal
set of variables with $V_n \cap \mathbb{V}_1 = V \cap \mathbb{V}_1$ and for every second-order variable $X \in V$ we have
$X \in V_n$ and there exist distinct second-order variables $X = X_1, \ldots, X_n \in V_n$ such that
$\{X_1, \ldots, X_n\}$ and $\{Y_1, \ldots, Y_n\}$ are disjoint for $X \ne Y$.

Let $h : B^* \to A^*$ be a morphism, let $\delta : \mathbb{V}_1 \to_p \mathbb{N}$ and let $n \in \mathbb{N}$. Let the morphism
$h_\delta : (B \times 2^{V_n})^* \to (A \times 2^V)^*$ be given by $h_\delta(b, J) = (a_1, J_1) \cdots (a_\ell, J_\ell)$ if
$h(b) = a_1 \cdots a_\ell$ and

1. $x \in J_i$ if and only if $x \in J$ and $\delta(x) = i$ for all first-order variables $x$ and

2. $X \in J_i$ if and only if $X_i \in J$ for all second-order variables $X$. ◇
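
Definition 4 can be read as a letter-by-letter transformation. The following is a minimal sketch and
not the paper's formalization; the encoding is an assumption chosen for brevity: first-order variables are
strings such as `"x"`, the copy $X_k$ of a second-order variable is the pair `("X", k)`, and the names
`h_delta` and `h_delta_word` are hypothetical.

```python
def h_delta(h, delta, n, letter):
    """Image of one extended letter (b, J) under the morphism h_delta.

    h:      dict mapping each letter b to the string h(b).
    delta:  dict mapping first-order variables to their offsets (a partial function).
    n:      number of copies kept for every second-order variable.
    letter: pair (b, J) where J is a set of first-order variables and pairs (X, k).
    """
    b, J = letter
    image = h[b]
    assert len(image) <= n, "n must be at least the length of every image h(b)"
    result = []
    for i, a in enumerate(image, start=1):
        J_i = set()
        for v in J:
            if isinstance(v, tuple):
                name, k = v          # second-order copy X_k
                if k == i:
                    J_i.add(name)    # condition 2: X in J_i iff X_i in J
            elif delta.get(v) == i:
                J_i.add(v)           # condition 1: x in J_i iff x in J and delta(x) = i
        result.append((a, frozenset(J_i)))
    return result


def h_delta_word(h, delta, n, word):
    """Extend h_delta from letters to words by concatenation."""
    return [out for letter in word for out in h_delta(h, delta, n, letter)]
```

For example, with $h(b) = aa$, $\delta(x) = 2$ and $J = \{x, X_1\}$, the letter $(b, J)$ is mapped to
$(a, \{X\})(a, \{x\})$: the variable $x$ lands on the second letter of the image and $X$ contains the first.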

Note that in order to avoid an all too tedious notation, the parameter $n$ is understood
implicitly in $h_\delta$.

In this section we are going to give for every formula $\varphi$ and every "appropriate"
homomorphism $h_\delta$, a formula $h_\delta^{-1}(\varphi)$ such that

1. $h_\delta^{-1}(\varphi) \le_{\mathcal{F}} \varphi$ for all "appropriate" fragments $\mathcal{F}$, and

2. $\llbracket h_\delta^{-1}(\varphi) \rrbracket_{B,V_n} = h_\delta^{-1}(\llbracket\varphi\rrbracket_{A,V})$

where $V$ is a set of variables with $\mathrm{FV}(\varphi) \subseteq V$.

By the first, syntactic property in particular $h_\delta^{-1}(\varphi) \in \mathcal{F}$ if $\varphi \in \mathcal{F}$. What is appropriate
depends on the predicates used by the formula. The second property is semantic correctness,
i.e., $h_\delta^{-1}(\varphi)$ indeed defines the inverse image under $h_\delta$; again "appropriate" depends on
the predicates of the formula. Note that the statement is stronger than just closure of a
fragment under inverse morphisms because we get one formula which works for all appropriate
fragments. Also note that for first-order formulae we may choose $V \subseteq \mathbb{V}_1$ and thus $V_n = V$.

For the atomic formulae these are the following five lemmas. Lemma 13 gives the
construction for $\top$, $\bot$, $\lambda(x) = a$ and $x = y$; Lemma 14 is for $x < y$ and $x \le y$; Lemma 15 is for
$\mathrm{suc}(x,y)$, $\min(x)$, $\max(x)$ and empty; Lemma 16 is for $x \equiv r \pmod{q}$; and Lemma 17 is for
the second-order predicate $x \in X$. Subsequently, we give lifting arguments for Boolean
combinations (Lemma 18), first-order, second-order and modular quantification (respectively,
Lemma 19, Lemma 20 and Lemma 21).

The lemmas in this section can be seen as a toolbox for the closure under inverse
morphisms of which one needs to consider only those lemmas which are relevant in a given
situation. These are then connected by an easy induction. For example, for a fragment of
first-order logic without modular quantifiers using only the order predicates, it suffices to
consider Lemma 13 for true, false and label, Lemma 14 for the order, Lemma 18 for Boolean
connectives and Lemma 19 for first-order quantification.

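Lemma 13 Let $\varphi$ be one of the atomic formulae $\top$, $\bot$, $\lambda(x) = a$ or $x = y$ and let $V$ be a
set of variables with $\mathrm{FV}(\varphi) \subseteq V$. Then for all morphisms $h : B^* \to A^*$ and all $\delta : \mathbb{V}_1 \to_p \mathbb{N}$
there exists a formula $h_\delta^{-1}(\varphi)$ which for all fragments $\mathcal{F}$ and all $n \geq \max_{b \in B} |h(b)|$ satisfies

$h_\delta^{-1}(\varphi) \leq_{\mathcal{F}} \varphi$ and
$[\![h_\delta^{-1}(\varphi)]\!]_{B,V_n} = h_\delta^{-1}([\![\varphi]\!]_{A,V})$.
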
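Proof: We have $h_\delta(w) \in [\![\top]\!]_{A,V}$ if $\delta(x)$ is defined and if $1 \leq \delta(x) \leq |h(\lambda_w(x))|$ for all
first-order variables $x \in V$. Therefore, if $\delta(x)$ is undefined for some $x \in V \cap \mathbb{V}_1$, then we let
$h_\delta^{-1}(\top) = \bot$. Else we let

\[ h_\delta^{-1}(\top) = \bigwedge_{x \in V \cap \mathbb{V}_1} \lambda(x) \in B_{\delta(x)} \]

where $B_i = \{ b \in B \mid 1 \leq i \leq |h(b)| \}$ is the set of letters $b \in B$ such that $i$ is a position of
$h(b)$. By the above considerations we see $[\![h_\delta^{-1}(\top)]\!]_{B,V_n} = h_\delta^{-1}([\![\top]\!]_{A,V})$. Note that if $h$ is
length-multiplying, then we could also set $h_\delta^{-1}(\top) = \top$ if $\delta(x)$ is defined and in $\{1, \ldots, m\}$
and $h_\delta^{-1}(\top) = \bot$ otherwise; here $m = |h(b)|$ for some $b \in B$.

Consider a position $i$ of $h(w)$ with $h$-coordinates $(j, d)$. Then $\lambda_{h(w)}(i) = a$ is equivalent to
$\lambda_{h(b)}(d) = a$ where $b = \lambda_w(j)$. Let $C = \{ b \in B \mid \lambda_{h(b)}(\delta(x)) = a \}$ be the set of letters
$b \in B$ such that $h(b)$ has label $a$ at position $\delta(x)$. Let $i'$ be a position of $h(w)$ with
$h$-coordinates $(j', d')$. Then $i = i'$ if and only if $j = j'$ and $d = d'$. We therefore let

\[ h_\delta^{-1}(\lambda(x) = a) = h_\delta^{-1}(\top) \wedge \lambda(x) \in C, \]

\[ h_\delta^{-1}(x = y) = \begin{cases} h_\delta^{-1}(\top) \wedge x = y & \text{if } \delta(x) = \delta(y), \\ \bot & \text{else.} \end{cases} \]

It is easy to see that these formulae satisfy the syntactic property $h_\delta^{-1}(\varphi) \leq_{\mathcal{F}} \varphi$ for all
fragments $\mathcal{F}$. □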

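Lemma 14 Let $x \bowtie y$ be one of the atomic formulae $x < y$ or $x \leq y$ for some $x, y \in \mathbb{V}_1$.
Let $V$ be a set of variables with $\mathrm{FV}(x \bowtie y) \subseteq V$. Then for all morphisms $h : B^* \to A^*$ and
all $\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exists a formula $h_\delta^{-1}(x \bowtie y)$ which for all order-stable fragments $\mathcal{F}$
and all $n \geq \max_{b \in B} |h(b)|$ satisfies

$h_\delta^{-1}(x \bowtie y) \leq_{\mathcal{F}} (x \bowtie y)$ and

$[\![h_\delta^{-1}(x \bowtie y)]\!]_{B,V_n} = h_\delta^{-1}([\![x \bowtie y]\!]_{A,V})$.

Moreover, if $h$ is length-reducing, then $h_\delta^{-1}(x \bowtie y) \leq_{\mathcal{F}} (x \bowtie y)$ for all fragments $\mathcal{F}$.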

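Proof: Consider positions $i$ and $i'$ of $h(w)$ with $h$-coordinates $(j, d)$ and $(j', d')$, respectively.
Suppose $h$ is length-reducing. Then $d = d' = 1$ and hence $i < i'$ if and only if $j < j'$,
and $i \leq i'$ if and only if $j \leq j'$. Thus we let $h_\delta^{-1}(x < y) = (h_\delta^{-1}(\top) \wedge x < y)$ and
$h_\delta^{-1}(x \leq y) = (h_\delta^{-1}(\top) \wedge x \leq y)$ where $h_\delta^{-1}(\top)$ is the formula from Lemma 13 for the set $V$.
It is easy to see that these formulae satisfy the syntactic properties $h_\delta^{-1}(x < y) \leq_{\mathcal{F}} (x < y)$
and $h_\delta^{-1}(x \leq y) \leq_{\mathcal{F}} (x \leq y)$ for all fragments $\mathcal{F}$.

Suppose now $h$ is not length-reducing. Then $i < i'$ if and only if $j < j'$, or $j \leq j'$ and $d < d'$; and
$i \leq i'$ if and only if $j < j'$, or $j \leq j'$ and $d \leq d'$. Let therefore

\[ h_\delta^{-1}(x < y) = h_\delta^{-1}(\top) \wedge \begin{cases} x \leq y & \text{if } \delta(x) < \delta(y), \\ x < y & \text{else,} \end{cases} \]

\[ h_\delta^{-1}(x \leq y) = h_\delta^{-1}(\top) \wedge \begin{cases} x \leq y & \text{if } \delta(x) \leq \delta(y), \\ x < y & \text{else.} \end{cases} \]

It is easy to verify that these formulae satisfy the syntactic properties $h_\delta^{-1}(x < y) \leq_{\mathcal{F}} (x < y)$
and $h_\delta^{-1}(x \leq y) \leq_{\mathcal{F}} (x \leq y)$ for all order-stable fragments $\mathcal{F}$. □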
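The case distinction on offsets in this proof can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: positions are 1-based, the conjunct $h_\delta^{-1}(\top)$ is assumed to hold, and the function names are hypothetical. It compares the order of two positions of $h(w)$ with the truth value of the pulled-back formula evaluated on the $h$-coordinates.

```python
def position(w, h, j, d):
    """Position of h(w) with h-coordinates (j, d), 1-based."""
    return sum(len(h[b]) for b in w[:j - 1]) + d

def pulled_back_less(j, j2, d, d2):
    """Truth value of the formula for x < y, where x sits at position j of w
    with delta(x) = d and y sits at position j2 with delta(y) = d2."""
    return j <= j2 if d < d2 else j < j2

def pulled_back_leq(j, j2, d, d2):
    """Same as above for x <= y."""
    return j <= j2 if d <= d2 else j < j2
```

The non-strict comparison `j <= j2` appears in the formula for the strict predicate, which matches the point of the lemma that the construction stays inside the fragment only when the fragment is order-stable.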
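Lemma 15 Let $\varphi$ be one of the atomic formulae $\mathrm{suc}(x, y)$, $\min(x)$, $\max(x)$ or $\mathrm{empty}$ and
let $V$ be a set of variables with $\mathrm{FV}(\varphi) \subseteq V$. Then for all non-erasing morphisms $h : B^* \to A^*$
and all $\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exists a formula $h_\delta^{-1}(\varphi)$ which for all fragments $\mathcal{F}$ and all
$n \geq \max_{b \in B} |h(b)|$ satisfies

$h_\delta^{-1}(\varphi) \leq_{\mathcal{F}} \varphi$ and
$[\![h_\delta^{-1}(\varphi)]\!]_{B,V_n} = h_\delta^{-1}([\![\varphi]\!]_{A,V})$.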

Proof: Suppose $h$ is non-erasing. Clearly, $h(w)$ is empty only if $w$ is empty. Consider a
position $i$ of $h(w)$ with $h$-coordinates $(j, d)$. Then $i = 1$ is equivalent to $j = d = 1$; note that
for erasing morphisms we may have $i = 1$ but nonetheless $j > 1$. We therefore let

\[ h_\delta^{-1}(\mathrm{empty}) = \mathrm{empty}, \]

\[ h_\delta^{-1}(\min(x)) = \begin{cases} h_\delta^{-1}(\top) \wedge \min(x) & \text{if } \delta(x) = 1, \\ \bot & \text{else.} \end{cases} \]

For the max predicate we observe that $i = |h(w)|$ if and only if $j = |w|$ and $d = |h(b)|$ where $b = \lambda_w(j)$;
note that for erasing morphisms we may have $i = |h(w)|$ but nonetheless $j < |w|$. For suc
consider positions $i$ and $i'$ of $h(w)$ with $h$-coordinates $(j, d)$ and $(j', d')$, respectively. Suppose
$i + 1 = i'$. We consider two cases. If $d' = 1$, then necessarily $j + 1 = j'$ and $d = |h(b)|$ where
$b = \lambda_w(j)$; otherwise $j = j'$ and $d + 1 = d'$. Note that for erasing morphisms we may have
$i + 1 = i'$ but $j + 1 < j'$. Now, let $C = \{ b \in B \mid \delta(x) = |h(b)| \}$ be the set of labels for which
$\delta(x)$ is the maximum position in the image under $h$. Then we let

\[ h_\delta^{-1}(\max(x)) = h_\delta^{-1}(\top) \wedge \lambda(x) \in C \wedge \max(x), \]

\[ h_\delta^{-1}(\mathrm{suc}(x, y)) = \begin{cases} h_\delta^{-1}(\top) \wedge \mathrm{suc}(x, y) \wedge \lambda(x) \in C & \text{if } \delta(y) = 1, \\ h_\delta^{-1}(\top) \wedge x = y & \text{else if } \delta(x) + 1 = \delta(y), \\ \bot & \text{else.} \end{cases} \]

It is easy to see that these formulae satisfy the syntactic property $h_\delta^{-1}(\varphi) \leq_{\mathcal{F}} \varphi$ for all
fragments $\mathcal{F}$. □
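The successor case can be illustrated in the same spirit. The sketch below is an assumption-laden illustration rather than part of the construction: it assumes a non-erasing morphism, 1-based positions, and that $h_\delta^{-1}(\top)$ holds; the function names are hypothetical. It evaluates the three-way case distinction for $\mathrm{suc}(x, y)$ on $h$-coordinates.

```python
def position(w, h, j, d):
    """Position of h(w) with h-coordinates (j, d), 1-based."""
    return sum(len(h[b]) for b in w[:j - 1]) + d

def pulled_back_suc(w, h, j, j2, d, d2):
    """Case distinction for suc(x, y): x sits at position j of w with
    delta(x) = d, and y sits at position j2 with delta(y) = d2."""
    if d2 == 1:
        # y starts a new block, so x must be the last position of the previous block
        return j + 1 == j2 and d == len(h[w[j - 1]])
    if d + 1 == d2:
        # x and y lie in the same block h(b), so they stem from the same position of w
        return j == j2
    return False
```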



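Lemma 16 Consider the atomic formula $x \equiv r \pmod q$ for some $x \in \mathbb{V}_1$ and some $r, q \in \mathbb{Z}$.
Let $\mathfrak{F}$ be a family of fragments $\mathcal{F}$ such that $x \equiv r \pmod q$ and $x \equiv s \pmod q$ are
$\mathcal{F}$-equivalent for all $s \in \mathbb{Z}$. Let $V$ be a set of variables with $\mathrm{FV}(x \equiv r \pmod q) \subseteq V$. Then for
all length-multiplying morphisms $h : B^* \to A^*$ and all $\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exists a formula
$h_\delta^{-1}(x \equiv r \pmod q)$ which for all $\mathcal{F} \in \mathfrak{F}$ and all $n \geq |h(b)|$ for $b \in B$ satisfies

$h_\delta^{-1}(x \equiv r \pmod q) \leq_{\mathcal{F}} (x \equiv r \pmod q)$ and

$[\![h_\delta^{-1}(x \equiv r \pmod q)]\!]_{B,V_n} = h_\delta^{-1}([\![x \equiv r \pmod q]\!]_{A,V})$.

Moreover, if $h$ is length-preserving, then $h_\delta^{-1}(x \equiv r \pmod q) \leq_{\mathcal{F}} (x \equiv r \pmod q)$ for all
fragments $\mathcal{F}$.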
Proof: Let $h$ be length-multiplying and let $m = |h(b)|$ for some $b \in B$. If $m = 0$ or if $\delta(y)$
is undefined for some $y \in V \cap \mathbb{V}_1$, then $h_\delta^{-1}(x \equiv r \pmod q) = \bot$. Let $i$ be a position of
$h(w)$ with $h$-coordinates $(j, d)$. Since $h$ is length-multiplying, we have $i = m(j - 1) + d$.
Let $t = \gcd(q, m)$ be the greatest common divisor of $q$ and $m$. Let $q = pt$, $m = \ell t$ and let
$r' = r + m - d$. Then $i \equiv r \pmod q$ if and only if $mj \equiv r' \pmod q$. If this congruence is
solvable, then $t$ is a divisor of $r'$, and $mj \equiv r' \pmod q$ is equivalent to $\ell j \equiv r'/t \pmod p$.
Since $\gcd(\ell, p) = 1$ there exists a number $\ell^{-1}$ such that $\ell^{-1}\ell \equiv 1 \pmod p$, and
$\ell j \equiv r'/t \pmod p$ if and only if $j \equiv \ell^{-1} r'/t \pmod p$. Now the latter is equivalent to the
existence of $0 \leq k < t$ such that $j \equiv \ell^{-1} r'/t + kp \pmod q$.

These considerations lead to the following formula. If $r + m - \delta(x) \not\equiv 0 \pmod t$, then let
$h_\delta^{-1}(x \equiv r \pmod q) = \bot$. Let otherwise $s = \ell^{-1}(r + m - \delta(x))/t$ and

\[ h_\delta^{-1}(x \equiv r \pmod q) = h_\delta^{-1}(\top) \wedge \bigvee_{k=0}^{t-1} x \equiv s + kp \pmod q. \]

It is easy to see the syntactic property $h_\delta^{-1}(x \equiv r \pmod q) \leq_{\mathcal{F}} (x \equiv r \pmod q)$ for all
fragments $\mathcal{F} \in \mathfrak{F}$.

Note that if $h$ is length-preserving, then $h_\delta^{-1}(x \equiv r \pmod q) = h_\delta^{-1}(\top) \wedge x \equiv r \pmod q$
because $m = 1$, $t = 1$ and $s = r$. In this case $h_\delta^{-1}(x \equiv r \pmod q) \leq_{\mathcal{F}} (x \equiv r \pmod q)$ for
all fragments $\mathcal{F}$. □
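The number-theoretic step of this proof can be sanity-checked numerically. The following Python sketch is an illustration only: it assumes $q \geq 1$, $m \geq 1$ and $1 \leq d \leq m$, the function names are hypothetical, and it relies on the three-argument `pow` with exponent `-1` (Python 3.8 or later) for the modular inverse. It computes the residues $s + kp$ of the proof and compares them with a brute-force search.

```python
from math import gcd

def preimage_residues(r, q, m, d):
    """Residues j mod q with m*(j-1) + d = r (mod q), computed as in the proof."""
    t = gcd(q, m)
    p, l = q // t, m // t
    r1 = r + m - d                      # r' in the proof
    if r1 % t != 0:
        return set()                    # the formula is "false" in this case
    # inverse of l modulo p; for p = 1 every residue is congruent, so 0 works
    l_inv = pow(l, -1, p) if p > 1 else 0
    s = (l_inv * (r1 // t)) % q
    return {(s + k * p) % q for k in range(t)}

def brute_force(r, q, m, d):
    """Same set, found by testing one representative j of each residue class."""
    return {j % q for j in range(1, q + 1) if (m * (j - 1) + d - r) % q == 0}
```

For instance, with $q = 6$, $m = 4$, $d = 3$ and $r = 1$ one gets $t = 2$, $p = 3$, $\ell = 2$ and $r' = 2$, so the solutions are $j \equiv 2$ and $j \equiv 5 \pmod 6$.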

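Lemma 17 Consider the atomic formula $x \in X$ for some $x \in \mathbb{V}_1$ and $X \in \mathbb{V}_2$. Let $V$
be a set of variables with $\mathrm{FV}(x \in X) \subseteq V$. Then for all morphisms $h : B^* \to A^*$ and all
$\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exists a formula $h_\delta^{-1}(x \in X)$ which for all MSO-stable fragments $\mathcal{F}$ and
all $n \geq \max_{b \in B} |h(b)|$ satisfies

$h_\delta^{-1}(x \in X) \leq_{\mathcal{F}} (x \in X)$ and
$[\![h_\delta^{-1}(x \in X)]\!]_{B,V_n} = h_\delta^{-1}([\![x \in X]\!]_{A,V})$.

Moreover, if $h$ is length-reducing, then $h_\delta^{-1}(x \in X) \leq_{\mathcal{F}} (x \in X)$ for all fragments $\mathcal{F}$.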

Proof: Let $i$ be a position of $h(w)$ with $h$-coordinates $(j, d)$. Then, by definition of $h_\delta$, we see
$\lambda_{h_\delta(w)}(i) = (a, J)$ for some $J \subseteq V$ with $X \in J$ if and only if $\lambda_w(j) = (b, J')$ for some $J' \subseteq V_n$
with $X_d \in J'$. Now, if $\delta(x) \notin \{1, \ldots, N\}$ where $N = \max_{b \in B} |h(b)|$, then $h_\delta^{-1}(x \in X) = \bot$;
otherwise let

\[ h_\delta^{-1}(x \in X) = h_\delta^{-1}(\top) \wedge x \in X_{\delta(x)}. \]

This formula is easily seen to satisfy the syntactic property $h_\delta^{-1}(x \in X) \leq_{\mathcal{F}} (x \in X)$ for all
MSO-stable fragments $\mathcal{F}$. Moreover, if $h$ is length-reducing, then clearly $h_\delta^{-1}(x \in X) \leq_{\mathcal{F}}
(x \in X)$ for all fragments $\mathcal{F}$; notice that $X_1 = X$ by definition of $V_n$. □

Next, we give the following "lifting lemmas": If for some formulae the inverse morphic
images are definable, then so are the inverse morphic images of their Boolean combinations
(Lemma 18), their first-order and second-order quantification (Lemma 19 and Lemma 20,
respectively) and their modular counting quantification (Lemma 21). Moreover, the construction
respects every family of morphisms and every collection of fragments (in the case
of second-order quantification every collection of MSO-stable fragments; and for the modular
counting quantifier every collection of mod-stable fragments).
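Lemma 18 Let $\psi$ be one of the formulae $\neg\varphi_1$ or $\varphi_1 \vee \varphi_2$ or $\varphi_1 \wedge \varphi_2$. Let $\mathfrak{F}$ be a family
of fragments and let $\mathcal{C}$ be a family of morphisms between finitely generated free monoids.
Suppose for all sets of variables $V \supseteq \mathrm{FV}(\varphi_i)$, all $\mathcal{C}$-morphisms $h : B^* \to A^*$ and all $\delta : \mathbb{V}_1 \to_p
\mathbb{N}$ there exist formulae $h_\delta^{-1}(\varphi_i)$, $i \in \{1, 2\}$, which for all $\mathcal{F} \in \mathfrak{F}$ and all $n \geq \max_{b \in B} |h(b)|$
satisfy

$h_\delta^{-1}(\varphi_i) \leq_{\mathcal{F}} \varphi_i$ and
$[\![h_\delta^{-1}(\varphi_i)]\!]_{B,V_n} = h_\delta^{-1}([\![\varphi_i]\!]_{A,V})$.

Let $V$ be a set of variables with $\mathrm{FV}(\psi) \subseteq V$. Then for all $\mathcal{C}$-morphisms $h : B^* \to A^*$ and all
$\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exists a formula $h_\delta^{-1}(\psi)$ which for all $\mathcal{F} \in \mathfrak{F}$ and all $n \geq \max_{b \in B} |h(b)|$
satisfies

$h_\delta^{-1}(\psi) \leq_{\mathcal{F}} \psi$ and
$[\![h_\delta^{-1}(\psi)]\!]_{B,V_n} = h_\delta^{-1}([\![\psi]\!]_{A,V})$.
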
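Proof: The formulae for disjunction and conjunction are straightforward:

\[ h_\delta^{-1}(\varphi_1 \vee \varphi_2) = h_\delta^{-1}(\varphi_1) \vee h_\delta^{-1}(\varphi_2), \qquad h_\delta^{-1}(\varphi_1 \wedge \varphi_2) = h_\delta^{-1}(\varphi_1) \wedge h_\delta^{-1}(\varphi_2). \]

For the negation, we have to take more care. For $w \in [\![\top]\!]_{B,V_n}$ we may have $h_\delta(w) \notin
[\![\varphi_1]\!]_{A,V}$ simply because $h_\delta(w) \notin [\![\top]\!]_{A,V}$. This may happen, e.g., if $\delta$ is such that $h_\delta(w)$
does not allow to interpret all first-order variables of $V$. Therefore $[\![\neg h_\delta^{-1}(\varphi_1)]\!]_{B,V_n}$ is not a
subset of $h_\delta^{-1}([\![\top]\!]_{A,V})$ in general. This is enforced by a conjunction with $h_\delta^{-1}(\top)$ and we let

\[ h_\delta^{-1}(\neg\varphi_1) = h_\delta^{-1}(\top) \wedge \neg h_\delta^{-1}(\varphi_1). \]

It is easy to verify that the syntactic property of the $h_\delta^{-1}(\varphi_i)$ carries over to $h_\delta^{-1}(\psi)$, i.e., we
have $h_\delta^{-1}(\psi) \leq_{\mathcal{F}} \psi$ for all $\mathcal{F} \in \mathfrak{F}$. Note that $\neg h_\delta^{-1}(\varphi_1) \leq_{\mathcal{F}} \neg\varphi_1$. □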

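Lemma 19 Consider the formulae $\exists x\, \varphi$ and $\forall x\, \varphi$ for some $x \in \mathbb{V}_1$. Let $\mathfrak{F}$ be a family
of fragments and let $\mathcal{C}$ be a family of morphisms between finitely generated free monoids.
Suppose for all sets of variables $V \supseteq \mathrm{FV}(\varphi)$, for all $\mathcal{C}$-morphisms $h : B^* \to A^*$ and all
$\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exists a formula $h_\delta^{-1}(\varphi)$ which for all $\mathcal{F} \in \mathfrak{F}$ and all $n \geq \max_{b \in B} |h(b)|$
satisfies

$h_\delta^{-1}(\varphi) \leq_{\mathcal{F}} \varphi$ and

$[\![h_\delta^{-1}(\varphi)]\!]_{B,V_n} = h_\delta^{-1}([\![\varphi]\!]_{A,V})$.

Let $V$ be a set of variables with $\mathrm{FV}(\exists x\, \varphi) = \mathrm{FV}(\forall x\, \varphi) \subseteq V$. Then for all $\mathcal{C}$-morphisms
$h : B^* \to A^*$ and all $\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exist formulae $h_\delta^{-1}(\exists x\, \varphi)$ and $h_\delta^{-1}(\forall x\, \varphi)$ which for
all $\mathcal{F} \in \mathfrak{F}$ and all $n \geq \max_{b \in B} |h(b)|$ satisfy the following properties:

$h_\delta^{-1}(\exists x\, \varphi) \leq_{\mathcal{F}} \exists x\, \varphi$, $\quad [\![h_\delta^{-1}(\exists x\, \varphi)]\!]_{B,V_n} = h_\delta^{-1}([\![\exists x\, \varphi]\!]_{A,V})$,
$h_\delta^{-1}(\forall x\, \varphi) \leq_{\mathcal{F}} \forall x\, \varphi$, $\quad [\![h_\delta^{-1}(\forall x\, \varphi)]\!]_{B,V_n} = h_\delta^{-1}([\![\forall x\, \varphi]\!]_{A,V})$.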
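Proof: Let $i$ be a position of $h_\delta(w)$ with $h$-coordinates $(j, d)$. Then $h_\delta(w)[x/i] = h_{\delta[x/d]}(w[x/j])$
where $\delta[x/d]$ is given by $x \mapsto d$ and $y \mapsto \delta(y)$ if $y \neq x$. This leads to

\[ h_\delta^{-1}(\exists x\, \varphi) = \exists x \bigvee_{d=1}^{N} h_{\delta[x/d]}^{-1}(\varphi) \]

where $N = \max_{b \in B} |h(b)|$ and $h_{\delta[x/d]}^{-1}(\varphi)$ is the formula from the premise for the mapping
$h_{\delta[x/d]}$ and the set of variables $V \cup \{x\}$. Note that for the $h$-coordinates $(j, d)$ of every position
of $h(w)$ we have $1 \leq d \leq N$ by choice of $N$. The syntactic properties $h_\delta^{-1}(\exists x\, \varphi) \leq_{\mathcal{F}} \exists x\, \varphi$
and $h_\delta^{-1}(\forall x\, \varphi) \leq_{\mathcal{F}} \forall x\, \varphi$ for all $\mathcal{F} \in \mathfrak{F}$ are easily verified. □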

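Lemma 20 Consider the formulae $\exists X\, \varphi$ and $\forall X\, \varphi$ for some $X \in \mathbb{V}_2$. Let $\mathfrak{F}$ be a family
of fragments and let $\mathcal{C}$ be a family of morphisms between finitely generated free monoids.
Suppose for all sets of variables $V \supseteq \mathrm{FV}(\varphi)$, for all $\mathcal{C}$-morphisms $h : B^* \to A^*$ and all
$\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exists a formula $h_\delta^{-1}(\varphi)$ which for all $\mathcal{F} \in \mathfrak{F}$ and all $n \geq \max_{b \in B} |h(b)|$
satisfies

$h_\delta^{-1}(\varphi) \leq_{\mathcal{F}} \varphi$ and
$[\![h_\delta^{-1}(\varphi)]\!]_{B,V_n} = h_\delta^{-1}([\![\varphi]\!]_{A,V})$.

Let $V$ be a set of variables with $\mathrm{FV}(\exists X\, \varphi) = \mathrm{FV}(\forall X\, \varphi) \subseteq V$. Then for all $\mathcal{C}$-morphisms
$h : B^* \to A^*$ and all $\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exist formulae $h_\delta^{-1}(\exists X\, \varphi)$ and $h_\delta^{-1}(\forall X\, \varphi)$ which for
all MSO-stable fragments $\mathcal{F} \in \mathfrak{F}$ and all $n \geq \max_{b \in B} |h(b)|$ satisfy the following properties:

$h_\delta^{-1}(\exists X\, \varphi) \leq_{\mathcal{F}} \exists X\, \varphi$, $\quad [\![h_\delta^{-1}(\exists X\, \varphi)]\!]_{B,V_n} = h_\delta^{-1}([\![\exists X\, \varphi]\!]_{A,V})$,
$h_\delta^{-1}(\forall X\, \varphi) \leq_{\mathcal{F}} \forall X\, \varphi$, $\quad [\![h_\delta^{-1}(\forall X\, \varphi)]\!]_{B,V_n} = h_\delta^{-1}([\![\forall X\, \varphi]\!]_{A,V})$.

Moreover, if $h$ is length-reducing, then $h_\delta^{-1}(\exists X\, \varphi) \leq_{\mathcal{F}} \exists X\, \varphi$ and $h_\delta^{-1}(\forall X\, \varphi) \leq_{\mathcal{F}} \forall X\, \varphi$ for
all fragments $\mathcal{F} \in \mathfrak{F}$.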

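Proof: Let $i$ be a position of $h_\delta(w)$ with $h$-coordinates $(j, d)$. Then $\lambda_{h_\delta(w)}(i) \in A \times (\{X\} \cup 2^{V})$
if $\lambda_w(j) \in B \times (\{X_d\} \cup 2^{V_n})$ by definition of $h_\delta$. Hence we let

\[ h_\delta^{-1}(\exists X\, \varphi) = \exists X_1 \cdots \exists X_N\; h_\delta^{-1}(\varphi), \]

\[ h_\delta^{-1}(\forall X\, \varphi) = \forall X_1 \cdots \forall X_N\; h_\delta^{-1}(\varphi) \]

where $N = \max(\{1\} \cup \{ |h(b)| \mid b \in B \})$ and $h_\delta^{-1}(\varphi)$ is the formula from the premise for
the set of variables $V \cup \{X\}$. Note that for the $h$-coordinates $(j, d)$ of every position of
$h(w)$ we have $1 \leq d \leq N$ by choice of $N$. It is easily verified that the syntactic properties
$h_\delta^{-1}(\exists X\, \varphi) \leq_{\mathcal{F}} \exists X\, \varphi$ and $h_\delta^{-1}(\forall X\, \varphi) \leq_{\mathcal{F}} \forall X\, \varphi$ hold for $\mathcal{F} \in \mathfrak{F}$ if $\mathcal{F}$ is MSO-stable or if $h$ is
length-reducing. Notice that $X_1 = X$ by definition of $V_n$, and that if $h$ is length-reducing,
then $N = 1$. □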
The following lemma lifts the inverse morphism construction to modular counting quantifiers.
It applies in particular to mod-stable fragments, but holds in a more general setting.
Specifically, we do not need the closure under negation required for mod-stability.

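Lemma 21 Consider the formula $\exists^{r \bmod q} x\, \varphi$ for some $x \in \mathbb{V}_1$ and some $r, q \in \mathbb{Z}$. Let $\mathfrak{F}$
be a family of fragments and let $\mathcal{C}$ be a family of morphisms between finitely generated free
monoids. Suppose for all sets of variables $V \supseteq \mathrm{FV}(\varphi)$, for all $\mathcal{C}$-morphisms $h : B^* \to A^*$ and
all $\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exists a formula $h_\delta^{-1}(\varphi)$ which for all $\mathcal{F} \in \mathfrak{F}$ and all $n \geq \max_{b \in B} |h(b)|$
satisfies

$h_\delta^{-1}(\varphi) \leq_{\mathcal{F}} \varphi$ and
$[\![h_\delta^{-1}(\varphi)]\!]_{B,V_n} = h_\delta^{-1}([\![\varphi]\!]_{A,V})$.

Let $V$ be a set of variables with $\mathrm{FV}(\exists^{r \bmod q} x\, \varphi) \subseteq V$. Then for all $\mathcal{C}$-morphisms $h : B^* \to A^*$
and all $\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exists a formula $h_\delta^{-1}(\exists^{r \bmod q} x\, \varphi)$ which for all $\mathcal{F} \in \mathfrak{F}$ with
$(\exists^{r \bmod q} x\, \varphi) \equiv_{\mathcal{F}} (\exists^{s \bmod q} x\, \varphi)$ (for all $s \in \mathbb{Z}$) and all $n \geq \max_{b \in B} |h(b)|$ satisfies

$h_\delta^{-1}(\exists^{r \bmod q} x\, \varphi) \leq_{\mathcal{F}} \exists^{r \bmod q} x\, \varphi$ and

$[\![h_\delta^{-1}(\exists^{r \bmod q} x\, \varphi)]\!]_{B,V_n} = h_\delta^{-1}([\![\exists^{r \bmod q} x\, \varphi]\!]_{A,V})$.

Moreover, if $h$ is length-reducing, then $h_\delta^{-1}(\exists^{r \bmod q} x\, \varphi) \leq_{\mathcal{F}} \exists^{r \bmod q} x\, \varphi$ for all $\mathcal{F} \in \mathfrak{F}$.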

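Proof: For $d \in \mathbb{N}$ by $\delta[x/d] : \mathbb{V}_1 \to_p \mathbb{N}$ we denote the mapping $x \mapsto d$ and $y \mapsto \delta(y)$ if $y \neq x$.
Let $h_{\delta[x/d]}^{-1}(\varphi)$ be the formula from the premise for the set of variables $V \cup \{x\}$. Let

\[ h_\delta^{-1}(\exists^{r \bmod q} x\, \varphi) = \bigvee_{s \in S} \bigwedge_{d=1}^{N} \exists^{s(d) \bmod q} x\; h_{\delta[x/d]}^{-1}(\varphi) \]

where $N = \max_{b \in B} |h(b)|$ and $S$ is the set of functions $s : \{1, \ldots, N\} \to \{0, \ldots, q - 1\}$ such
that $\sum_{d=1}^{N} s(d) \equiv r \pmod q$. A position is a $\varphi$-position if $\varphi$ holds when $x$ is interpreted
to be this position. The idea is that the $\varphi$-positions of $h_\delta(w)$ are partitioned; for every
$d \in \{1, \ldots, N\}$ the number of $\varphi$-positions of $h_\delta(w)$ originating from a position in $w$ with
offset $d$ is counted separately (modulo $q$). The total sum of these counts then has to be $r$
(modulo $q$). Note that $S$ is finite and that for the $h$-coordinates $(j, d)$ of every position of $h(w)$
we have $1 \leq d \leq N$ by choice of $N$. Hence, every $\varphi$-position of $h_\delta(w)$ is counted in precisely
one of the terms of the conjunction. The syntactic property $h_\delta^{-1}(\exists^{r \bmod q} x\, \varphi) \leq_{\mathcal{F}} \exists^{r \bmod q} x\, \varphi$
for all $\mathcal{F} \in \mathfrak{F}$ with $(\exists^{r \bmod q} x\, \varphi) \equiv_{\mathcal{F}} (\exists^{s \bmod q} x\, \varphi)$ is easily verified.

Suppose $h$ is length-reducing, i.e., $N \leq 1$. Consider first the case $N = 0$. If $r \equiv 0
\pmod q$, then $h_\delta^{-1}(\exists^{r \bmod q} x\, \varphi) = \top$ and else $h_\delta^{-1}(\exists^{r \bmod q} x\, \varphi) = \bot$. If $N = 1$, then $S$
contains only the function $s$ with $s(1) = (r \bmod q)$ and we redefine $h_\delta^{-1}(\exists^{r \bmod q} x\, \varphi) =
\exists^{r \bmod q} x\; h_{\delta[x/1]}^{-1}(\varphi)$. In both cases the formulae satisfy $h_\delta^{-1}(\exists^{r \bmod q} x\, \varphi) \leq_{\mathcal{F}} \exists^{r \bmod q} x\, \varphi$
for all $\mathcal{F} \in \mathfrak{F}$. □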
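The counting argument behind this formula can be illustrated with a small Python sketch. It is not part of the proof and rests on assumptions made for the example: the per-offset numbers of $\varphi$-positions are given directly as a tuple `counts`, $q \geq 1$, and the function names are hypothetical. The sketch enumerates the set $S$ and evaluates the disjunction over $S$ of the conjunction over the offsets $d$.

```python
from itertools import product

def residue_vectors(N, q, r):
    """The set S: all functions s from {1..N} to {0..q-1}, written as tuples,
    whose values sum to r modulo q."""
    return [s for s in product(range(q), repeat=N) if sum(s) % q == r % q]

def holds_by_offsets(counts, q, r):
    """Disjunction over s in S of the conjunction over offsets d, where
    counts[d-1] is the number of phi-positions with offset d."""
    return any(
        all(c % q == s_d for c, s_d in zip(counts, s))
        for s in residue_vectors(len(counts), q, r)
    )
```

Since every position has exactly one offset, the disjunction holds exactly when the total number of $\varphi$-positions is congruent to $r$ modulo $q$. For $N = 0$ the only candidate is the empty function, which reproduces the special case treated at the end of the proof.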

We are now ready to prove Proposition 2.

Proposition 2 Let $\mathcal{F}$ be a fragment and let $\mathcal{C}$ be a family of morphisms between finitely
generated free monoids. Suppose the following:

1. If $\mathcal{F}$ contains a second-order quantifier, then $\mathcal{F}$ is MSO-stable or all $\mathcal{C}$-morphisms are
length-reducing.

2. If $\mathcal{F}$ contains the predicate $<$ or $\leq$, then $\mathcal{F}$ is order-stable or all $\mathcal{C}$-morphisms are
length-reducing.

3. If $\mathcal{F}$ contains the predicate suc, min, max or empty, then all $\mathcal{C}$-morphisms are
non-erasing.

4. If $\mathcal{F}$ contains a modular predicate, then all $\mathcal{C}$-morphisms are length-multiplying and
either $\mathcal{F}$ is mod-stable or all $\mathcal{C}$-morphisms are length-preserving.

5. If $\mathcal{F}$ contains a modular quantifier, then $\mathcal{F}$ is mod-stable or all $\mathcal{C}$-morphisms are
length-reducing.

Then the class of languages defined by $\mathcal{F}$ is closed under inverse $\mathcal{C}$-morphisms.

Proof: We show by induction on the structure of $\varphi$ that for all sets of variables $V$ containing
$\mathrm{FV}(\varphi)$, for all $\mathcal{C}$-morphisms $h : B^* \to A^*$ and for all $\delta : \mathbb{V}_1 \to_p \mathbb{N}$ there exists a formula $h_\delta^{-1}(\varphi)$
which satisfies $h_\delta^{-1}(\varphi) \leq_{\mathcal{F}} \varphi$ and $[\![h_\delta^{-1}(\varphi)]\!]_{B,V_n} = h_\delta^{-1}([\![\varphi]\!]_{A,V})$ for $n = \max_{b \in B} |h(b)|$. For
the atomic modalities $\top$, $\bot$, $\lambda(x) = a$ and $x = y$ this is Lemma 13; for $x < y$ and $x \leq y$
it is Lemma 14; for $\mathrm{suc}(x, y)$, $\min(x)$, $\max(x)$ and $\mathrm{empty}$ it is Lemma 15; for the modular
predicate $x \equiv r \pmod q$ it is Lemma 16; and for $x \in X$ it is Lemma 17. Note that in
all cases the lemmas do apply. For Boolean connectives, first-order quantification, second-order
quantification and modular quantification, this follows by induction and Lemma 18,
Lemma 19, Lemma 20 and Lemma 21, respectively (where $\mathfrak{F} = \{\mathcal{F}\}$). With this claim closure
under inverse morphic images follows readily: Suppose $\varphi$ is a sentence and let $h^{-1}(\varphi) =
h_\delta^{-1}(\varphi)$ for some arbitrary $\delta$. Then we get $h^{-1}(L_A(\varphi)) = L_B(h^{-1}(\varphi))$ and $h^{-1}(\varphi) \leq_{\mathcal{F}} \varphi$. In
particular $\varphi \in \mathcal{F}$ implies $h^{-1}(\varphi) \in \mathcal{F}$, that is, if $L \in \mathcal{L}_A(\mathcal{F})$, then $h^{-1}(L) \in \mathcal{L}_B(\mathcal{F})$. □



E. Remaining Proofs from Section S] 

Corollary 1 Let $\mathcal{F} \subseteq \mathrm{MSO}[<, \leq, =]$ be a fragment which is MSO-stable and order-stable.
Then $\mathcal{F}$ defines a positive $*$-variety.

Proof: This is a direct consequence of Theorem 1. □

Corollary 2 Let $\mathcal{F} \subseteq \mathrm{MSO}[<, \leq, =, \mathrm{suc}, \min, \max]$ be a fragment which is MSO-stable
and order-stable. Suppose $\min(y) \leq_{\mathcal{F}} \mathrm{suc}(x, y)$ and $\max(x) \leq_{\mathcal{F}} \mathrm{suc}(x, y)$ for all first-order
variables $x$ and $y$. Then the languages defined by $\mathcal{F}$ over nonempty words form a positive
$+$-variety.

Proof: Let $L = L_A(\varphi) \cap A^+$ be the language defined by $\varphi \in \mathcal{F}_A$ over $A^+$. We first show
closure under left residuals. Let $\mathcal{G}$ be the smallest fragment containing $\mathcal{F}$ and satisfying
$\mathrm{empty} \leq_{\mathcal{G}} \min(x)$ and $\mathrm{empty} \leq_{\mathcal{G}} \max(x)$ for all first-order variables $x$. Then $\mathcal{G}$ is suc-stable
and MSO-stable. Of course, $L_A(\varphi) \in \mathcal{L}_A(\mathcal{G})$ and hence $a^{-1}L_A(\varphi) \in \mathcal{L}_A(\mathcal{G})$ via some formula
$a^{-1}\varphi \in \mathcal{G}$ because $\mathcal{L}(\mathcal{G})$ is closed under residuals by Proposition 1. Replacing in $a^{-1}\varphi$ each
predicate empty by $\bot$ to obtain $\psi$, we get $\psi \leq_{\mathcal{F}} a^{-1}\varphi$. Hence $\psi \in \mathcal{G}$ which implies $\psi \in \mathcal{F}$.
But $\psi$ and $a^{-1}\varphi$ are equivalent over nonempty words, i.e., $L_A(\psi) \cap A^+ = L_A(a^{-1}\varphi) \cap A^+ =
a^{-1}L_A(\varphi) \cap A^+ = a^{-1}L \cap A^+$. This shows that $\psi$ defines the left residual by $a$ of $L$ over
$A^+$.

Closure under right residuals follows by symmetry and closure of $\mathcal{L}(\mathcal{F})$ under inverse
non-erasing morphisms is an immediate consequence of Proposition 2. □
F. Proof of Theorem 2

Theorem 2 A language is definable in $\mathrm{FO}^2[<]$ if and only if it is definable by a formula
in $\Sigma_2[<, \leq]$ with an acyclic comparison graph.

Proof: Suppose $L \subseteq A^*$ is $\mathrm{FO}^2[<]$-definable. Then $L$ is a finite union of unambiguous
monomials, i.e., languages of the form $P = A_1^* a_1 \cdots A_n^* a_n A_{n+1}^*$ such that each $w \in P$ has
a unique factorization $w = w_1 a_1 \cdots w_n a_n w_{n+1}$ with $w_i \in A_i^*$; see [501 E]. The parameter $n$
is the degree of $P$. There exists $i \in \{1, \ldots, n\}$ such that $a_i \notin A_1 \cap A_{n+1}$ because otherwise
$(a_1 \cdots a_n)^2$ admits two different factorizations. Suppose $a_i \notin A_1$ and let $a = a_i$. Making
the first occurrence of $a$ explicit in $P$ shows that $P$ is a finite union of languages of the
form $Q_1 a Q_2$ where $Q_1$ and $Q_2$ are unambiguous monomials with degree smaller than $n$.
Inductively, there exist $\varphi_j \in \Sigma_2[<, \leq]$ with acyclic comparison graph such that $\varphi_j$ defines
$Q_j$. We may assume that the variables used by $\varphi_1$ and $\varphi_2$ are disjoint. The next step is to
relativize $\varphi_1$ to the factor to the left of the first $a$-position. For this let

\[ \varphi_1' = \exists x_1 \exists x_2 \Big( \varphi_1\langle{<}\rangle \wedge \bigwedge_{j \in \{1,2\}} \big( \lambda(x_j) = a \wedge \forall y\, (x_j \leq y \vee \lambda(y) \neq a) \big) \Big) \qquad (1) \]

where $x_1, x_2, y$ are new variables. Before turning to the formula $\varphi_1\langle{<}\rangle$ note that both $x_1$
and $x_2$ specify the first $a$-position, thus ensuring $x_1 = x_2$ without actually using equality.
(This will become important for acyclicity.) Also take notice that $\lambda(y) \neq a$ can be expressed
positively by $\lambda(y) \in A \setminus \{a\}$. The construction of $\varphi_1\langle{<}\rangle$ is by induction on the structure of
the formula. Let $\psi\langle{<}\rangle = \psi$ for atomic formulae, $(\neg\psi)\langle{<}\rangle = \neg\psi\langle{<}\rangle$, $(\psi_1 \diamond \psi_2)\langle{<}\rangle =
\psi_1\langle{<}\rangle \diamond \psi_2\langle{<}\rangle$ for $\diamond \in \{\vee, \wedge\}$ and

\[ (\exists z\, \psi)\langle{<}\rangle = \exists z\, (z < x_1 \wedge \psi\langle{<}\rangle), \]
\[ (\forall z\, \psi)\langle{<}\rangle = \forall z\, (x_2 \leq z \vee \psi\langle{<}\rangle). \]

The formula $\varphi_1'$ holds on a word if and only if the word has an $a$-position and $\varphi_1$ holds on
the factor before the first $a$-position. One can verify that $\varphi_1'$ is acyclic since $\varphi_1$ is. A similar
construction yields an acyclic formula $\varphi_2'$ which evaluates $\varphi_2$ on the factor beyond the first
$a$-position. This shows that $Q_1 a Q_2$ is defined by the acyclic formula $\varphi_1' \wedge \varphi_2'$. Hence, $P$ is
defined by a disjunction of acyclic formulae which, after renaming variables, yields an acyclic formula.
The construction for $a_i \notin A_{n+1}$ is similar but the $x_j$ in (1) then specify the last $a$-position,
i.e., "$x_j \leq y$" is replaced by "$y \leq x_j$".
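The relativization to the factor before the first $a$-position can be made concrete with a small Python sketch. Everything in it is an assumption chosen for illustration: formulae are nested tuples such as `("exists", z, body)`, only label and order atoms are supported, positions are 1-based, and the two positions $x_1 = x_2$ are passed in through the environment instead of being quantified as in (1). The names `relativize` and `holds` are hypothetical.

```python
def relativize(phi, x1, x2):
    """Restrict every quantifier of phi to positions strictly before x1 (= x2)."""
    op = phi[0]
    if op in ("label", "lt", "le"):
        return phi                                   # atomic formulae are unchanged
    if op == "not":
        return ("not", relativize(phi[1], x1, x2))
    if op in ("and", "or"):
        return (op, relativize(phi[1], x1, x2), relativize(phi[2], x1, x2))
    if op == "exists":
        return ("exists", phi[1], ("and", ("lt", phi[1], x1), relativize(phi[2], x1, x2)))
    if op == "forall":
        return ("forall", phi[1], ("or", ("le", x2, phi[1]), relativize(phi[2], x1, x2)))
    raise ValueError(op)

def holds(phi, word, env):
    """Evaluate phi on word under the variable assignment env."""
    op = phi[0]
    if op == "label":
        return word[env[phi[1]] - 1] in phi[2]
    if op == "lt":
        return env[phi[1]] < env[phi[2]]
    if op == "le":
        return env[phi[1]] <= env[phi[2]]
    if op == "not":
        return not holds(phi[1], word, env)
    if op == "and":
        return holds(phi[1], word, env) and holds(phi[2], word, env)
    if op == "or":
        return holds(phi[1], word, env) or holds(phi[2], word, env)
    positions = range(1, len(word) + 1)
    if op == "exists":
        return any(holds(phi[2], word, {**env, phi[1]: i}) for i in positions)
    if op == "forall":
        return all(holds(phi[2], word, {**env, phi[1]: i}) for i in positions)
    raise ValueError(op)
```

Evaluating the relativized formula on the whole word, with `x1` and `x2` bound to the first $a$-position, should agree with evaluating the original formula on the prefix before that position. Note that the existential quantifier is guarded by the strict atom and the universal quantifier by the non-strict one, as in the two displayed rules.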

Let $L \subseteq A^*$ be defined by $\varphi \in \Sigma_2[<, \leq]$ with $G(\varphi) = (V, E)$ acyclic. Suppose $\varphi$ is in
prenex form, i.e., $\varphi = \exists x_1 \cdots \exists x_k \forall y_{k+1} \cdots \forall y_\ell\; \psi$ where $\psi$ is quantifier-free. We shall show
that $p(uv)^n u (uv)^n q \in L \Leftrightarrow p(uv)^{2n} q \in L$ for $n \geq \ell^2$ and $u, v, p, q \in A^*$. From this it follows
that the syntactic monoid of $L$ is in DA which is known to be equivalent to $L$ being definable
in $\mathrm{FO}^2[<]$, see [inil5].

The implication $p(uv)^{2n} q \in L(\varphi) \Rightarrow p(uv)^n u (uv)^n q \in L(\varphi)$ holds for all $\varphi \in \Sigma_2[<, \leq]$
without acyclicity condition (201 iSj. It therefore suffices to show the converse implication.
We may assume that $E$ is a linear order on $V$ such that $(j, j + 1) \in E$ for $1 \leq j < k$
and for $k < j < \ell$. For simplicity, we identify variables with their interpretation on a word.
Consider an interpretation $x_1, \ldots, x_k \in \mathbb{N}$ on $p(uv)^n u (uv)^n q$ such that for all interpretations
of the $y_j$ the formula $\psi$ holds on $p(uv)^n u (uv)^n q$. By choice of $n$, there exists a factorization
$p(uv)^n u (uv)^n q = p'(uv)^\ell w (uv)^\ell q'$ with $p'(uv)^\ell$ being a prefix of $p(uv)^n$ and $(uv)^\ell q'$ being a
suffix of $(uv)^n q$ and such that none of the $x_j$ is in one of the factors $(uv)^\ell$ of this factorization.
Specifically, $x_j \in I_1 \cup I_2 \cup I_3$ for $1 \leq j \leq k$ where
• $I_1 = \{1, \ldots, |p'|\}$,

• $I_2 = \{|p'(uv)^\ell| + 1, \ldots, |p'(uv)^\ell w|\}$, and

• $I_3 = \{|p'(uv)^\ell w (uv)^\ell| + 1, \ldots, |p(uv)^n u (uv)^n q|\}$.

Let $p(uv)^{2n} q = p'(uv)^\ell w' (uv)^\ell q'$ and let $I_1'$, $I_2'$, and $I_3'$ be defined analogously to $I_1$, $I_2$ and
$I_3$, respectively, with $w$ replaced by $w'$. Let $\pi : \mathbb{N} \to \mathbb{N}$ be an order-respecting injection
mapping $I_j'$ to $I_j$, $j \in \{1, 2, 3\}$.

We construct an interpretation $x_j'$ of the $x_j$ on $p(uv)^{2n} q$ as follows. For $x_j \in I_1 \cup I_3$ we set
$x_j' = \pi^{-1}(x_j)$. For the positions $x_j \in I_2$ we let $x_j' \in I_2'$ such that $x_j$ and $x_j'$ have the same label
and such that $j < j'$ implies $x_j' < x_{j'}'$. Assume there exists an interpretation $y_{k+1}', \ldots, y_\ell'$
on $p(uv)^{2n} q$ for $y_{k+1}, \ldots, y_\ell$ making $\psi$ false. We are going to construct an interpretation
of the $y_j$ on $p(uv)^n u (uv)^n q$ such that $\psi$ is false. If $y_j' \in I_1' \cup I_3'$, then $y_j = \pi(y_j')$. Every
$y_j' \notin I_1' \cup I_3'$ is classified into "left" or "right" as follows. If $y_j' < \min \{ x_{j'}' \in I_2' \mid j' \leq k \}$, then
$y_j'$ is "left". If $y_j' > \max \{ x_{j'}' \in I_2' \mid j' \leq k \}$, then $y_j'$ is "right". Else $x_{j'}' \leq y_j' \leq x_{j''}'$ for some
$j', j'' \leq k$. We distinguish three cases:

1. If $(j', j) \in E$ and $(j'', j) \in E$, then $y_j'$ is "left".

2. If $(j, j') \in E$ and $(j, j'') \in E$, then $y_j'$ is "right".

3. If $(j', j) \in E$ and $(j, j'') \in E$, then $y_j'$ is "right".

(In the third case the classification does not really matter.) Note that by acyclicity, $(j'', j) \in
E$ and $(j, j') \in E$ cannot happen. The "left" (respectively "right") positions are set
label-respecting in the range between $|p'| + 1$ and $|p'(uv)^\ell|$ (between $|p'(uv)^\ell w| + 1$ and
$|p'(uv)^\ell w (uv)^\ell|$, respectively) such that $y_j < y_{j'}$ for "left"-positions (respectively
"right"-positions) $y_j'$ and $y_{j'}'$ with $j' < j$. By construction every atomic formula which is true on
$p(uv)^n u (uv)^n q$ is also true on $p(uv)^{2n} q$. Since $\psi$ does not contain negations, it is monotonic
in its atoms and thus $\psi$ is false on $p(uv)^n u (uv)^n q$ for this valuation, a contradiction. □